Need Advice on Datafeeds

Phew - I'm certainly glad I pulled Carsten in on that one :)

I realized that the original question was not really answered at your site, but you still have one of the most comprehensive and understandable resources out there on the whole subject.

Sorry I created so much work for you! Thanks for helping out, and clarifying so much in your posts - I'm sure this thread will now be a great help to many forum visitors.

Mo

Hehe ... no problem. Keep in mind that I also spent the time to put all the stuff up at Cumbrowski.com. I don't make my living off that site; it's a pet project for me and a way of making things accessible, and it lets me just throw links at somebody instead of writing everything again and again.

The discussion here was also interesting for me and I am sure that I will use some of the stuff that came out of here to improve and extend my resources at my site. See, I got something out of it too. hehe.
 
- Get at least some programming skills and also some experience with database software (MySQL or MS SQL Server (which I use))
- Get familiar with delimited text files and their problems. You can use Excel's import feature to import feeds. There you will see fairly quickly what some of the issues are
- Never assume that the feed file is okay and that the columns are used for what they are supposed to be used
- Get familiar with FTP (basic FTP that comes with Windows is all you need)
- Get familiar with XMLHTTP to pull data via HTTP
- Get familiar with data validation and data normalization principles and problems
- Learn how to schedule things for automation. I use SQL Server Agent to schedule things, but the Windows scheduler would work for a lot of things too
- Basic XML knowledge would not hurt either
- Get familiar with web services and use them whenever you can.
- Amazon's and eBay's APIs are good for learning and gaining experience with web services
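
To illustrate the point about never trusting the feed file, here is a minimal Python sketch that checks each row of a pipe-delimited feed before accepting it. The column layout (`sku`, `name`, `price`, `url`), the function name and the sample rows are all made up for the example; real feeds differ by merchant and network.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical column layout; real feeds differ by merchant and network.
EXPECTED_COLUMNS = ["sku", "name", "price", "url"]


def validate_feed(text, delimiter="|"):
    """Split feed rows into (good, bad) instead of trusting the file."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header!r}")

    good, bad = [], []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            bad.append((line_no, "wrong column count"))
            continue
        record = dict(zip(EXPECTED_COLUMNS, (v.strip() for v in row)))
        try:
            record["price"] = Decimal(record["price"])
        except InvalidOperation:
            bad.append((line_no, "price is not a number"))
            continue
        if not record["url"].startswith(("http://", "https://")):
            bad.append((line_no, "url column does not hold a URL"))
            continue
        good.append(record)
    return good, bad


# Made-up sample: one clean row, one bad price, one short row.
sample = (
    "sku|name|price|url\n"
    "A1|Blue Widget|9.99|http://example.com/a1\n"
    "A2|Red Widget|n/a|http://example.com/a2\n"
    "A3|Green Widget|4.50\n"
)
good, bad = validate_feed(sample)
print(len(good), "valid rows; rejected:", bad)
```

Collecting the rejected rows with their line numbers, rather than stopping at the first problem, makes it easier to see what is wrong with a given merchant's feed.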

Carsten, Thanks again......!

Could you explain more about XML and how to use XML datafeeds?

1) Delete all records related to the feed in the database and then import it as if it were new. Depending on the feed, the key for the deletion filter is usually the merchant ID on the network you pull the feed from. The problem is that you can't use your own primary key for products in this case, because the key values would change and all your references to the old records would become invalid

2) Check if the product exists in the DB by doing a look-up with a unique identifier for the Merchant+Product combination and do an update if it exists (have a date stamp for "last updated" and update that as well). Insert all records you can't find in the DB. Delete all records where the "last updated" stamp was not updated. This requires more logic for the import script and takes longer to import data.
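
Option 2 could be sketched roughly like this in Python. It uses the standard library's `sqlite3` purely as a stand-in for MySQL or MS SQL Server, and the table layout, the `sync_feed` name and the sample stamps are assumptions made for the illustration.

```python
import sqlite3
from datetime import datetime, timezone


def sync_feed(conn, merchant_id, rows, stamp=None):
    """Update-or-insert each feed row, then delete rows the feed no longer lists."""
    stamp = stamp or datetime.now(timezone.utc).isoformat()
    for sku, name, price in rows:
        cur = conn.execute(
            "UPDATE products SET name = ?, price = ?, last_updated = ? "
            "WHERE merchant_id = ? AND sku = ?",
            (name, price, stamp, merchant_id, sku),
        )
        if cur.rowcount == 0:  # look-up found nothing, so this is a new product
            conn.execute(
                "INSERT INTO products (merchant_id, sku, name, price, last_updated) "
                "VALUES (?, ?, ?, ?, ?)",
                (merchant_id, sku, name, price, stamp),
            )
    # Anything whose stamp was not refreshed in this run has left the feed.
    conn.execute(
        "DELETE FROM products WHERE merchant_id = ? AND last_updated <> ?",
        (merchant_id, stamp),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    " id INTEGER PRIMARY KEY,"  # your own key, stable across imports
    " merchant_id TEXT NOT NULL,"
    " sku TEXT NOT NULL,"
    " name TEXT, price REAL, last_updated TEXT,"
    " UNIQUE (merchant_id, sku))"  # the Merchant+Product identifier
)
# Two made-up imports: A2 disappears from the feed, A3 is new, A1 changes price.
sync_feed(conn, "m1", [("A1", "Blue Widget", 9.99), ("A2", "Red Widget", 5.00)],
          stamp="2024-01-01T00:00:00")
sync_feed(conn, "m1", [("A1", "Blue Widget", 8.99), ("A3", "Green Widget", 4.50)],
          stamp="2024-01-02T00:00:00")
result = conn.execute("SELECT id, sku, price FROM products ORDER BY sku").fetchall()
print(result)
```

Product A1 keeps its own `id` across both imports, which is exactly what option 1 cannot offer.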

Hope that helps
Cheers

Interesting! So here you are working directly within the MySQL database. I suppose you have to either work fast or shut your site down while you are doing this work.....wouldn't you?

Charlie
 
Carsten, Might I ask you about scripts such as Associate Engine that use Perl and CGI to allow one to put dynamic datafeeds on your site?

Can you help me to understand how this program works in easy terms....;)

Does this type of program use XML or APIs (do these terms mean basically the same thing?)

Associate Engine allows you a lot of control over how the feed is presented on your page and it pulls in the datafeed in real time and generates it only when the information is requested.

Would this not be the kind of datafeed model that one would want from other merchants' datafeeds, and why is this not provided more often?

Or am I misunderstanding this program?

Thanks

Charlie
 
I use data feeds extensively on my sites coupled with MySQL.
If you would like the easy route to set up an affiliate shop then I would recommend looking at Affilistore, which has an easy-to-use CSV import script.
The problems start if you rely on SEO for marketing: if a data feed needs updating a lot, then any indexed pages will have changed by the time anyone finds them!
 
Steve, thanks I have just recently been looking at that site from a link on Carsten's page....I appreciate it and will check it out.

Charlie
 
Carsten, Thanks again......!

Could you explain more about XML and how to use XML datafeeds?

See
XML Datafeeds and Webservices for Affiliates

Interesting! So here you are working directly within the MySQL database. I suppose you have to either work fast or shut your site down while you are doing this work.....wouldn't you?

Charlie

What I described was the basic concept. There are a number of factors that determine the details. The work with data feeds is in fact closely related to what organic web search engines do. The article "Why Writing Your Own Search Engine is Hard" in ACM Queue magazine by Anna Patterson, a former Google engineer, provides some very good insights.

ACM Queue - Why Writing Your Own Search Engine is Hard: So you have a grand idea; are you ready for the execution?

Those things become important when you start dealing with millions of products from hundreds of merchants rather than a couple hundred thousand from a handful of merchants.

Silencio said:
Carsten, Might I ask you about scripts such as Associate Engine that use Perl and CGI to allow one to put dynamic datafeeds on your site?

Can you help me to understand how this program works in easy terms....

"Associate Engine" rings a bell, but I am not an expert when it comes to Perl and CGI. I have used Perl code and even changed it, but I am not a Perl programmer. I know that Perl is fast and powerful for parsing text data, faster than any ASP/.NET/PHP code possibly could be. That frees some resources, but also has its limits.

I have the most experience with the Microsoft platform. I use MS SQL Server and know a lot more about that than I do about MySQL.

Silencio said:
Does this type of program use XML or APIs (do these terms mean basically the same thing?)

What you use on your end depends on what is available on the provider's end. If they dump a pipe-delimited text file compressed with gzip into a folder on one of their servers and the only access to that file is via FTP, you can't do much with web services there. Amazon, for example, has its web services and no public access to feeds in file format (because it's too much data, roughly a single-layer DVD full of gzip-compressed XML files).

If you want to get product data from wherever you can, you have to support all types of formats and delivery methods, plus some intermediate steps, such as compression (or decompression) of ZIP or GZIP archives.
That's why I mentioned that you should familiarize yourself with

- FTP (send and retrieve)
- HTTP (XMLHTTP)
- Delimited Text Files (CSV, Tab Delimited, Pipe Delimited etc.)
- XML (the general markup, including parsers such as SAX)
- Web Services (SOAP/WSDL, REST, XML-RPC etc.)
- GZIP (and ZIP)

Everything I have seen so far involves one or more of those, but I have seen a great number of different combinations of them.
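
As one example of such a combination, here is a small Python sketch that takes the raw bytes of a gzip-compressed, pipe-delimited feed and turns them into rows. How the bytes arrive (FTP or HTTP) is left out; the function name and the sample columns are made up for the illustration.

```python
import csv
import gzip
import io


def read_gzipped_feed(raw_bytes, delimiter="|", encoding="utf-8"):
    """Decompress a gzip'd delimited feed and yield one dict per row."""
    with gzip.open(io.BytesIO(raw_bytes), mode="rt",
                   encoding=encoding, newline="") as fh:
        # The first line of the file is used as the column names.
        yield from csv.DictReader(fh, delimiter=delimiter)


# Stand-in for bytes fetched over FTP or HTTP; the columns are made up.
payload = gzip.compress(
    b"sku|name|price\nA1|Blue Widget|9.99\nA2|Red Widget|5.00\n"
)
rows = list(read_gzipped_feed(payload))
print(rows)
```

Because the function yields rows one at a time, a large feed does not have to be held in memory as a whole before it is imported.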



Silencio said:
Associate Engine allows you a lot of control over how the feed is presented on your page and it pulls in the datafeed in real time and generates it only when the information is requested.

Would this not be the kind of datafeed model that one would want from other merchants' datafeeds, and why is this not provided more often?

Or am I misunderstanding this program?

If you pull full feeds in real time, then you are wasting an awful lot of resources processing a large amount of data just to return a few records.

This is where web services come in handy. The main workload is on the provider's end and not yours. You don't have to keep all the product data in your own database, although I recommend looking into "caching" anyway for improved performance. You don't want to make a call to the web service provided by the merchant or network if the user only refreshes the page.
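
A very simple form of such caching might look like the following Python sketch. The `TTLCache` class, the five-minute lifetime and the `fake_product_lookup` function are all assumptions for illustration; the fake function stands in for a real call to a merchant's or network's web service.

```python
import time


class TTLCache:
    """Keep web-service responses briefly so a page refresh is served locally."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no remote call
        value = self._fetch(key)
        self._store[key] = (now + self._ttl, value)
        return value


calls = []


def fake_product_lookup(sku):
    # Stand-in for a real call to a merchant's or network's web service.
    calls.append(sku)
    return {"sku": sku, "price": "9.99"}


cache = TTLCache(fake_product_lookup, ttl_seconds=300)
first = cache.get("A1")
second = cache.get("A1")  # e.g. the user refreshes the page
print(len(calls))
```

How long a response may be kept depends on how quickly prices and availability change at the provider, so the lifetime is something to tune per source.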

I hope this makes sense.

Regarding Steve's comments on Affilistore:

Affilistore just launched a beta of their tool in January this year. I have not checked since then, but from what Steve says, they seem to have made good progress. A tool that can import product feeds and update your database on an ongoing basis is already complicated and resource-intensive. Once you have the data in your DB, the real work starts, because you have to think about what you are going to do with it. Just cranking out HTML pages with the feed content is not going to cut it.

Even if you are ranking with those pages today, chances are that you won't tomorrow, because search engines try to get rid of this. If you are in the game for the long run, think long about how you can apply the data in a useful way that creates additional value beyond the raw feed data themselves. Creating spam pages is a "hit and run" approach, which becomes more and more a "miss and run" approach, with the advancements search engines are making to make you "miss".

Cheers!