So it turns out this is actually kind of a hard problem. When you import into a soup, the data isn't added directly to the soup; it's added to the database, and the soup is updated from the database in the background. That means that to do this we really need the more general deduping at the database level (which is a somewhat substantial amount of work).
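
To make the point concrete, here is a rough sketch of what deduping at the database level could look like. This is not the project's actual code: the `items` table, the `content_hash` column, and the `import_item` / `refresh_soup` helpers are all hypothetical names, and SQLite stands in for whatever database is really used. It also only catches exact duplicates, which is probably the easy part of the problem.

```python
import hashlib
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical items table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY,"
        " content_hash TEXT NOT NULL UNIQUE,"  # the UNIQUE constraint does the deduping
        " body TEXT NOT NULL)"
    )
    return db


def import_item(db, body):
    """Import one item into the database; return True if it was new."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    # INSERT OR IGNORE silently skips rows that would violate the UNIQUE constraint.
    cur = db.execute(
        "INSERT OR IGNORE INTO items (content_hash, body) VALUES (?, ?)",
        (digest, body),
    )
    db.commit()
    return cur.rowcount == 1


def refresh_soup(db):
    """Stand-in for the background update: rebuild the soup from the database."""
    return [row[0] for row in db.execute("SELECT body FROM items ORDER BY id")]
```

Because the import path never touches the soup itself, the duplicate check has to sit where the write actually happens, i.e. in the database. The soup then picks up already-deduped data whenever it is next refreshed.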