Introduction
Hi, I am Akira, your guide at Data Without Code. Let me ask you a question: Have you ever tried dragging a VLOOKUP formula down 100,000 rows in Excel, only to watch your screen freeze and the application crash?
As a DX manager who spent years in non-tech roles, I have lost count of how many hours I wasted waiting for Excel to calculate massive VLOOKUPs. When I finally transitioned to no-code data tools, learning how to replicate this exact function was my number one priority.
If you have already followed my guide on how to install KNIME, you are ready for the magic. In this tutorial, I will show you how to do a VLOOKUP in KNIME in seconds using the Joiner node.
VLOOKUP in KNIME: Meet the Joiner Node
In the world of databases and data science, a VLOOKUP is actually called a “Join.” It simply means bringing two different tables together based on a matching column (like a Product ID or Employee ID).
In KNIME Analytics Platform, we do this using the Joiner node. It is incredibly fast, does not crash with millions of rows, and completely eliminates those annoying #N/A errors.
Step-by-Step Guide to Using the Joiner Node
Let’s imagine a classic business scenario: You have a table of Sales Data (with Product IDs but no product names), and a Master Product List (which has the Product IDs and their corresponding names). We need to bring the product names into the sales data.
Step 1: Read Your Data
First, you need to bring both tables into your KNIME workspace. You will use two Excel Reader nodes (or CSV Reader nodes). Connect one to your Sales Data and the other to your Master Product List.
(Tip: If you are not sure how to import files yet, you can learn the basics in my upcoming guide on how to read and write CSV files automatically in KNIME.)
Step 2: Add and Connect the Joiner Node
Search for the Joiner node in the Node Repository (bottom-left panel) and drag it onto your canvas. You will notice that the Joiner node has two input ports on the left side.
- Connect your Sales Data to the Top Input Port (Think of this as your main table).
- Connect your Master Product List to the Bottom Input Port (Think of this as your lookup table).
Step 3: Configure the Matching Columns
Double-click the Joiner node to open the configuration menu. The first tab is called “Joiner Settings”. Here, you need to tell KNIME which columns to match.
Select “Product ID” from the Top Input (Sales Data) and match it to “Product ID” from the Bottom Input (Master Product List). This is exactly the same as selecting your lookup value in Excel!
Step 4: Select “Left Outer Join” (The VLOOKUP Equivalent)
This is the most crucial step for beginners. In the middle of the configuration window, you will see different “Join Modes”.
To replicate a standard Excel VLOOKUP perfectly, you must select Left Outer Join. This tells KNIME: “Keep all the rows in my main Sales Data, and only bring in the matching information from the Master Product List.”
Step 5: Choose Your Columns
Finally, go to the “Column Selection” tab. Here, you can uncheck any columns you do not need from the Master Product List. For example, you might only want to bring in the “Product Name” and ignore the rest. Click OK, and then execute the node (press F8 or click the green play button).
Right-click the executed Joiner node and select “Joined table” to see your results. Boom! You just did a VLOOKUP on thousands of rows instantly.
Why the Joiner Node is Better Than Excel
Using the Joiner node instead of Excel formulas has massive benefits for business automation:
- Speed: It processes millions of rows in seconds.
- No more broken formulas: If a lookup value is missing, KNIME simply leaves it as a null value (represented by a red question mark
?) without breaking your entire sheet. - Visual Traceability: Anyone on your team can look at your canvas and immediately understand where the data came from. For more on this, check out my visual guide to understanding KNIME nodes.
Conclusion: Your Next Steps
Mastering the Joiner node is your first major milestone in leaving manual Excel work behind. As a DX manager, I can confidently say that once you stop using VLOOKUPs and start using Joins, you will never look back.
Now that your data is combined, the next logical step is usually cleaning it up. Are you ready to refine your dataset? Join me in the next tutorial where I explain how to filter rows and remove duplicates easily in KNIME!
