This tutorial describes how to create a Power Search query that will capture the Product Name, Product Quantity, Product Code, and Product Description of all the items on the Power Search demo site: powersearchdemo.inspyder.com
Open Internet Explorer and navigate to the demo site, http://powersearchdemo.inspyder.com.
Right-click on the page and click "View Source" or "View Page Source". This opens the underlying HTML of the page.
Locate the code that displays the item details. (Look for the product name or other unique text that is found on this page that you wish to capture.)
Select the HTML code that surrounds the text you wish to extract and copy it into Notepad. Below you can see that we've copied everything between the "table" tags.
Query strings can get somewhat messy, so we start with a simple query string that captures only the name of the product. From the HTML source above we can see that the product name follows a unique pattern: provided the website is structured consistently, the name can be identified by the CSS class name "pName". Our query string must therefore be long enough to include that unique identifier.
Next, we replace the current product name, which changes from one product to another, with a Wildcard Match identifier. A Wildcard Match identifier marks the text that changes arbitrarily between an opening and a closing tag.
In this example we change the text "<td>Animal A</td>" to "<td>#Name#</td>". Below is the corresponding query string which will capture the name of all the products:
<td class="pName">Product Name:</td>*<td>#Name#</td>
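To get a feel for what this query string matches, it can be approximated with a regular expression. The Python sketch below is only an analogy, not Power Search's own implementation: it assumes the `*` behaves like a non-greedy "match anything" and that `#Name#` behaves like a named capture of the text up to the next tag. The sample HTML is hypothetical, modeled on the "Animal A" example above.

```python
import re

# Hypothetical fragment modeled on the demo page's product tables.
sample_html = """
<table>
  <tr><td class="pName">Product Name:</td>
      <td>Animal A</td></tr>
</table>
<table>
  <tr><td class="pName">Product Name:</td>
      <td>Animal B</td></tr>
</table>
"""

# Assumed regex equivalent of the query string:
#   <td class="pName">Product Name:</td>*<td>#Name#</td>
name_pattern = re.compile(
    r'<td class="pName">Product Name:</td>'  # literal text that anchors the match
    r'.*?'                                    # '*': skip whatever lies in between
    r'<td>(?P<Name>[^<]*)</td>',              # '#Name#': capture the cell text
    re.DOTALL,                                # let '.' cross line breaks
)

names = [m.group("Name") for m in name_pattern.finditer(sample_html)]
print(names)
```

Each match yields one product name, which mirrors how the query string returns one result per product.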
Now we will expand the query string to capture both the product name and the product code. The query string for capturing the product code is similar to the one for the product name; however, instead of the CSS class name "pName", the class name "pCode" uniquely identifies the product code. We are not interested in any text between a closing </td> tag and the next opening <td> tag, so we add a Wildcard Character (*) there.
Below is the corresponding query string, which will capture the name and code of all the products:
<td class="pName">Product Name:</td>*<td>#Name#</td>*<td class="pCode">Product Code:</td>*<td>#Code#</td>
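The role of the Wildcard Character can be illustrated with the same regular-expression analogy. The sketch below builds the two-field pattern twice, once with nothing in place of each `*` and once with a non-greedy wildcard, and runs both against a hypothetical fragment in which whitespace and row tags separate the cells. The markup, the code value "A-001", and the helper name `build` are assumptions made for illustration.

```python
import re

# Hypothetical layout: line breaks and <tr> tags sit between the cells.
sample_html = (
    '<tr><td class="pName">Product Name:</td>\n'
    '    <td>Animal A</td></tr>\n'
    '<tr><td class="pCode">Product Code:</td>\n'
    '    <td>A-001</td></tr>\n'
)

def build(gap):
    """Build the name-and-code pattern, placing `gap` wherever the query has '*'."""
    return re.compile(
        r'<td class="pName">Product Name:</td>' + gap +
        r'<td>(?P<Name>[^<]*)</td>' + gap +
        r'<td class="pCode">Product Code:</td>' + gap +
        r'<td>(?P<Code>[^<]*)</td>',
        re.DOTALL,
    )

without_wildcard = build(r'')     # cells must be directly adjacent
with_wildcard = build(r'.*?')     # anything may sit between the cells

no_match = without_wildcard.search(sample_html)
match = with_wildcard.search(sample_html)

print(no_match)                                   # nothing found
print(match.group("Name"), match.group("Code"))   # both fields captured
```

Without the wildcard, the intervening whitespace and tags prevent any match; with it, both fields are captured in a single pass.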
We can expand this to include the product name, code, quantity, and description:
<td class="pName">Product Name:</td>*<td>#Name#</td>*<td class="pCode">Product Code:</td>*<td>#Code#</td>*<td class="pQuantity">Product Quantity:</td>*<td>#Qnty#</td>*<td class="pDescription">Product Description:</td>*<td>#Description#</td>
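For readers who want to test a query string outside the tool, the translation used in this analogy can be automated. The following is a minimal sketch under assumed semantics (`*` as a non-greedy wildcard, `#Field#` as a named capture that stops at the next tag); the function `query_to_regex`, the helper `product_table`, and the sample values are all hypothetical and are not part of Power Search.

```python
import re

def query_to_regex(query):
    """Translate a Power Search-style query string into a regex (assumed semantics)."""
    parts = []
    # Split on '*' and '#Field#' while keeping those delimiters in the result.
    for token in re.split(r'(\*|#\w+#)', query):
        if token == '*':
            parts.append(r'.*?')                               # wildcard character
        elif re.fullmatch(r'#\w+#', token):
            parts.append(r'(?P<%s>[^<]*)' % token.strip('#'))  # named capture
        else:
            parts.append(re.escape(token))                     # literal HTML
    return re.compile(''.join(parts), re.DOTALL)

def product_table(name, code, qty, desc):
    """Build hypothetical markup that uses the class names from the query."""
    cells = [("pName", "Product Name:", name),
             ("pCode", "Product Code:", code),
             ("pQuantity", "Product Quantity:", qty),
             ("pDescription", "Product Description:", desc)]
    rows = "".join(
        '  <tr><td class="%s">%s</td>\n      <td>%s</td></tr>\n' % cell
        for cell in cells
    )
    return "<table>\n" + rows + "</table>\n"

query = (
    '<td class="pName">Product Name:</td>*<td>#Name#</td>*'
    '<td class="pCode">Product Code:</td>*<td>#Code#</td>*'
    '<td class="pQuantity">Product Quantity:</td>*<td>#Qnty#</td>*'
    '<td class="pDescription">Product Description:</td>*<td>#Description#</td>'
)

sample_html = (
    product_table("Animal A", "A-001", "5", "First sample item")
    + product_table("Animal B", "B-002", "12", "Second sample item")
)

# One dictionary per product, keyed by the field names used in the query.
rows = [m.groupdict() for m in query_to_regex(query).finditer(sample_html)]
for row in rows:
    print(row)
```

One caveat of this approximation: because the wildcard can span any amount of text, a product that lacks one of the fields could cause a match to run on into the next product's table.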
Open Inspyder Power Search and paste the query string created above into the Query text box.
Setting the Project Settings:
Setting Query Options:
Search Result: