MANIFOLD
Make mana!! Find me a big genealogical database (anonymised is okay) with 10m people in it from post 1900, in the US/EU
resolved May 1
Resolved
NO

I want to pay someone to do this for me. If you do it and I end up with the file, then this resolves YES. I will pump a ton of mana into NO beforehand which means you can get a lot of cheap shares if you know it's possible.

Requirements:

  • must be a standard format (listed below)

  • must have at least 10m people in it who were born 1900 or later

  • it can be anonymised (i.e. no real names/no real birth locations) but I have to be able to at least distinguish who they are and their sex and date of birth, and who each person's biological parents were. Also we have to know meaningful things about the progeny - i.e. it can't be a random mix, it has to be scoped to a meaningful level, e.g. "Americans in a certain state" or "a single EU country and time period", rather than a weirdly assembled block which may have irregularities

  • it has to be locally quite complete, i.e. a block of data about a population without randomly missing people in it. If you have the entire world's db and give me a random 1% of it, that's not good. But if you have complete data for a 10m person country in Europe over a discrete time period, that IS good. So, location / source has to be scoped, and time period has to be limited. And it can't be an incoherent subcommunity (i.e. you can't give me 90% Roman Catholic parishioners, but no Protestants, and 10% jailbirds, unless I have a way to distinguish)

  • I will not accept anything illegal or with PII of living people in it - so definitely don't include that

  • I don't need or want any exact identifying information. Even exact birth dates can vary, I just need them accurate within say a month, and the order of kids has to be there, and I have to know if the kid is adopted. For marriages it would be good to know if someone died or had a spouse die.

I have to be able to get the file downloaded to my computer by the deadline. You can DM me and I will try hard to make it work. If it costs say $100 that's okay too, I will pay, as long as I can get it.

Formats:

  • GEDCOM (.ged)

  • Ancestral File (.af)

  • Family Tree Maker (.ftm, .ftmb)

  • Legacy Family Tree (.fdb)

  • Personal Ancestral File (.paf)

  • Ancestry.com Family Trees (.aft)
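
For anyone who wants to check a candidate file against the requirements above before sending it, here is a rough Python sketch for the GEDCOM case. It assumes the usual GEDCOM 5.5-style line layout (`level [xref] tag [value]`, with `INDI`, `FAM`, `SEX`, `BIRT`/`DATE`, `FAMC`, `HUSB`, `WIFE` tags). The names `Person`, `parse_gedcom` and `count_usable` are made up for illustration, and it only counts people born 1900 or later who have a sex and two linked parents. It does not check adoption status, completeness, or scope.

```python
import re
from dataclasses import dataclass
from typing import Optional

YEAR_RE = re.compile(r"\b(\d{4})\b")  # first 4-digit number in a DATE value


@dataclass
class Person:  # hypothetical container, not part of any library
    xref: str
    sex: Optional[str] = None
    birth_year: Optional[int] = None
    famc: Optional[str] = None  # family in which this person is a child


def parse_gedcom(lines):
    """Stream over GEDCOM lines; return (people, families) keyed by xref."""
    people, families = {}, {}
    current = None    # the level-0 record we are inside (Person or dict)
    in_birth = False  # True while inside a level-1 BIRT block
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if level == 0:
            current, in_birth = None, False
            if len(parts) == 3 and parts[1].startswith("@"):
                xref, tag = parts[1], parts[2].split(" ")[0]
                if tag == "INDI":
                    current = people.setdefault(xref, Person(xref))
                elif tag == "FAM":
                    current = families.setdefault(xref, {})
            continue
        if current is None:
            continue
        tag = parts[1]
        value = parts[2] if len(parts) == 3 else ""
        if isinstance(current, Person):
            if level == 1:
                in_birth = tag == "BIRT"
                if tag == "SEX":
                    current.sex = value
                elif tag == "FAMC" and current.famc is None:
                    current.famc = value
            elif level == 2 and in_birth and tag == "DATE":
                match = YEAR_RE.search(value)
                if match:
                    current.birth_year = int(match.group(1))
        elif level == 1 and tag in ("HUSB", "WIFE"):
            current[tag] = value
    return people, families


def count_usable(people, families, min_year=1900):
    """Count people born in min_year or later with a sex and both parents linked."""
    usable = 0
    for person in people.values():
        if person.birth_year is None or person.birth_year < min_year:
            continue
        family = families.get(person.famc, {})
        if person.sex in ("M", "F") and "HUSB" in family and "WIFE" in family:
            usable += 1
    return usable


SAMPLE = """0 HEAD
0 @I1@ INDI
1 SEX M
1 BIRT
2 DATE 12 JAN 1905
1 FAMC @F1@
0 @I2@ INDI
1 SEX M
1 BIRT
2 DATE 1870
0 @I3@ INDI
1 SEX F
1 BIRT
2 DATE ABT 1875
0 @I4@ INDI
1 SEX F
1 BIRT
2 DATE 3 MAR 1910
0 @F1@ FAM
1 HUSB @I2@
1 WIFE @I3@
1 CHIL @I1@
0 TRLR
""".splitlines()

people, families = parse_gedcom(SAMPLE)
print(count_usable(people, families))
```

Since `parse_gedcom` takes any iterable of lines, an open file handle can be passed in directly, so the raw file is read line by line rather than loaded as one string. The parsed records themselves are still held in memory.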

Deadline April 30 2024

I've pumped a bunch more mana into NO. If you get the db and help me access it, this will resolve YES immediately and you will make a ton of mana.

When you say you want to pay someone - do you mean mana or real money?

@jim I will pay USD by Venmo, and if you have such a DB and this market is sitting at 3%, there is mana to be made there, too.

@jim Ideally the point of this is to direct me to a place I can find such a db.

@Ernie I am not sure how you can do the things you are mentioning with properly deidentified data. Birthdates specifically are considered identifiable data, and combined with links to parental lineage or death info, they would let you pretty easily identify the entire dataset.

bought Ṁ750 NO

@BTE Is that even illegal? Big genealogy companies have all this data, as does the Mormon Church. I know an older genealogist whose personal GEDCOM file on his hard drive contains 57k people.

Normally you blank out the names of living people, but not their birth dates, but everyone who has died is public. Also, you can buy old census data after 70 years I believe.

@Ernie I didn’t say it was illegal. I said the fields you are asking for are inconsistent with an anonymized data set. I think you can probably get the data you are looking for by buying a few different data sets and joining them. Idk, I can think about it…

@BTE Ah, I see, you're saying that if I had basic family structure for say 40 related people, I could search for that subgraph in the db and then realize that yes, it was them? That makes sense, at higher N uniqueness must set in pretty fast.

If the db had no names and no real birthdays, I wouldn't feel like I was being too invasive though. I would gain the ability to say that some random cousin of the people whose data I knew in real life had a mother who had say 5 other kids, which would be new info for me. But that's at least not 1st level clear PII.

@Ernie I have a lot of experience with decedent data and I think that would be the best source. No need to worry about identity then. You only want data for some local population, right? Or is that one of the requirements?