Dupes be gone!





Duplicate items are an RSS aggregator’s worst enemy, and many of the dedicated folks using Outlook 2007 Beta 2 know that build did not do a great job of handling the many ways dupes can occur.


Since the Beta 2 build we’ve made numerous improvements to how the RSS architecture deals with duplicate items. These include changes to the individual download logic for feeds, the server sync if you’re in an Exchange environment, and the delete behavior for individual items.


When you delete an individual RSS item from the feed’s folder in Outlook 2007, we take it as “I’m done with this item and don’t want to see it again.” This means if the post continues to exist in the XML file we get from the content publisher for another few days (or however long it takes to roll off the end of the file), we will not download it again. Read Status is also handled the same way; mark an item as Read and its status will not change in this scenario.
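To make the delete model concrete, here is a minimal Python sketch of the idea. It is an illustration under assumptions, not Outlook’s actual implementation: the `FeedFolder` class, its `deleted_guids` set, and the dictionary shape of a feed entry are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FeedFolder:
    """Hypothetical model of one feed's folder (not Outlook's real data structures)."""
    items: dict = field(default_factory=dict)        # guid -> {"title": str, "read": bool}
    deleted_guids: set = field(default_factory=set)  # GUIDs the user has deleted

    def delete(self, guid):
        # Remove the item, but remember its GUID so it is never downloaded again.
        if self.items.pop(guid, None) is not None:
            self.deleted_guids.add(guid)

    def mark_read(self, guid):
        self.items[guid]["read"] = True

    def sync(self, feed_entries):
        """Merge entries from a freshly downloaded feed; return the GUIDs added."""
        added = []
        for entry in feed_entries:
            guid = entry["guid"]
            if guid in self.deleted_guids or guid in self.items:
                # Deleted earlier, or already present: skip it and leave
                # the existing item's read status untouched.
                continue
            self.items[guid] = {"title": entry["title"], "read": False}
            added.append(guid)
        return added


# The same two posts stay in the publisher's XML across two downloads.
feed = [{"guid": "a", "title": "First"}, {"guid": "b", "title": "Second"}]
folder = FeedFolder()
first_sync = folder.sync(feed)    # both posts are new
folder.delete("a")                # "I'm done with this item"
folder.mark_read("b")
second_sync = folder.sync(feed)   # nothing comes back, and "b" stays read
```

The key design choice in this sketch is that deleting an item records its GUID rather than simply forgetting the item, which is what keeps a still-published post from reappearing on the next download.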


If a blogger or content publisher modifies a post and wants to be sure readers see it again, they should follow the best practice of re-posting the new content. This creates a new GUID, so Outlook (and other aggregators that follow this delete model) sees it as a new item and downloads it as appropriate.


Minor or non-content changes made to existing items in the feed’s XML – especially random tags used by a specific aggregator or inserted automatically by the syndication engine – will not cause Outlook to see them as new items and download duplicates. We saw a large number of duplicate feed items in Beta 2 because of this, and our improvements to the update logic for individual posts are designed to handle it. The specific logic for determining which fields to use for change detection in Outlook is now the same as in IE 7.
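One way to picture field-based change detection is to fingerprint only the fields that count as content and ignore everything else on the entry. The Python sketch below is a rough illustration. The `CONTENT_FIELDS` list is an assumption made for the example (this post does not list the fields Outlook or IE 7 actually compare), and `content_fingerprint` and `classify` are hypothetical helpers.

```python
import hashlib

# Assumption: which fields count as "content". The real list used by
# Outlook and IE 7 is not given in this post.
CONTENT_FIELDS = ("title", "link", "description")


def content_fingerprint(entry):
    """Hash only the content fields, ignoring any other tags on the entry."""
    digest = hashlib.sha256()
    for name in CONTENT_FIELDS:
        value = " ".join(entry.get(name, "").split())  # collapse whitespace runs
        digest.update(name.encode("utf-8") + b"\0" + value.encode("utf-8") + b"\0")
    return digest.hexdigest()


def classify(stored, incoming):
    """Return 'new', 'updated', or 'unchanged' for an incoming feed entry.

    `stored` maps guid -> fingerprint of the copy already downloaded.
    """
    guid = incoming["guid"]
    if guid not in stored:
        return "new"        # unseen GUID, e.g. a re-posted item
    if stored[guid] != content_fingerprint(incoming):
        return "updated"    # same GUID, content fields changed: update in place
    return "unchanged"      # same GUID, only non-content tags differ


original = {
    "guid": "post-1",
    "title": "Dupes be gone!",
    "link": "http://example.com/1",
    "description": "Body text",
}
stored = {"post-1": content_fingerprint(original)}

tweaked = dict(original, recentPosts="links inserted by the blog engine")
edited = dict(original, description="Body text, revised")
reposted = dict(original, guid="post-2")

results = [classify(stored, e) for e in (tweaked, edited, reposted)]
```

In this sketch an extra tag such as the made-up `recentPosts` field leaves the fingerprint alone, so the item is neither duplicated nor flagged, while a re-post under a fresh GUID is treated as a new item.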


 

Comments (18)

  1. Dan Dautrich says:

    Hooray!  I’ve noticed the "non-content changes" showing up as unread posts in my own blog feed (when I post new items and older items get the "recent posts" links updated).  Seeing 3 or 4 identical copies per post on my subscribed Microsoft blogs isn’t out of the ordinary either, so this comes as a welcome improvement.  Keep up the great work, guys.

  2. Kevin Dente says:

    Have you done a survey of blogging engines to see how they behave relative to post changes?

    This is definitely a tricky area, and one where different aggregators behave quite differently. IIRC, Newsgator Inbox actually uses hashes to detect post changes, which can definitely result in false positives but ensures that I see post updates that other aggregators (e.g. RSS Bandit) would ignore.

    It’s a fine line, I guess, between too much noise and missing updates. I tend to prefer Newsgator’s approach myself, as I don’t like to miss potentially important updates.

  3. Jorgen says:

    Michael,

    Please note that this also occurs when you move an RSS feed delivery location from an ost folder to a pst folder. When you do that, you receive all the RSS feed items once again…

  4. Michael Affronti explains how the next builds of Outlook 2007 fix major issues with feed duplicates appearing

  5. There’s been some conversations going on about how Outlook is handling duplicate RSS items and what that…

  6. nick gogerty says:

    Our free Outlook reader inclue!  supports video today http://mtadmin1.mailtail.com/video/inclueRSSFox.wmv

    We also have a one click feed discovery and add button

    http://www.inclue.com/incluebutton

    We handle dupes in the same way as stated above, but are working on doing things at the item level.  

    We launch a free version for Outlook Express in 3 weeks.

  7. Last night, I took the plunge: I’m now using Office 2007 Beta 2 on all of my computers.

    The last stronghold…

  13. With the Office Beta 2 Technical Refresh now live on the web, I wanted to take a few minutes and…