Will anyone get the JS function in the description down to below 100ms?
Resolved NO on Oct 5

return str.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase().trim().replace(/[- ,'".]/g, "");

Running this in a loop over all Manifold market titles and descriptions takes about 1 second. I feel like there's gotta be a more efficient way to do this.

(Must be under 100ms running on my machine, single-threaded.)
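
For anyone who wants to try, here is a rough sketch of how one pass could be timed. The names `normalizeText`, `sampleTexts`, and `timePass` are made up for illustration, and the sample corpus is a small stand-in rather than the real set of titles and descriptions, so the number it prints says nothing about the 100ms target on the author's machine.

```javascript
// The function from the description, wrapped so it can be called.
function normalizeText(str) {
  return str.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase().trim().replace(/[- ,'".]/g, "");
}

// Hypothetical stand-in corpus; the real input is every market title and description.
const sampleTexts = Array.from(
  { length: 20000 },
  (_, i) => `Will "Caf\u00e9 no. ${i}" re-open, or not?`
);

// Time one single-threaded pass of `fn` over `inputs`.
function timePass(fn, inputs) {
  const start = performance.now();
  let totalLength = 0; // use every result so the work cannot be skipped
  for (const text of inputs) totalLength += fn(text).length;
  return { ms: performance.now() - start, totalLength };
}

const { ms } = timePass(normalizeText, sampleTexts);
console.log(`one pass over ${sampleTexts.length} strings: ${ms.toFixed(1)} ms`);
```

Passing a candidate replacement as `fn` over the same inputs gives a like-for-like comparison, although timings from a single run are noisy.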

Have you considered rewriting it in Rust :^)

predictedNO

@MingweiSamuel I can honestly say that I had not.

predictedYES

Please try this:

const r = new RegExp("\\p{Diacritic}| |-|,|'|\"|\\.", "gu")
// compiling the regexp once, outside the loop, might save a couple of ms

(str
    .normalize("NFD")
    .replace(r, "")
    .toLowerCase()
)
predictedNO

@MrLuke255 This seems to have made it around 6% faster.

return str.normalize("NFD")
    .replace(/[\p{Diacritic}-'",. ]/gu, "")
    .toLowerCase();

predictedNO

@NadiaMatsiuk Invalid character class
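
The error comes from the hyphen. With the `u` flag, `\p{Diacritic}-'` inside a character class is read as a range starting at a class escape, which is a syntax error. A minimal sketch of one possible fix, moving the hyphen to the end of the class so it is taken literally (the names `compiles`, `postedSource`, `fixedSource`, and `stripChars` are hypothetical):

```javascript
// Returns true if the pattern compiles, false if RegExp construction throws.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    return false;
  }
}

// The class as posted: the hyphen sits directly after \p{Diacritic}.
const postedSource = "[\\p{Diacritic}-'\",. ]";
// Same characters, with the hyphen moved last so it is a literal hyphen.
const fixedSource = "[\\p{Diacritic}'\",. -]";

console.log(compiles(postedSource, "gu")); // false
console.log(compiles(fixedSource, "gu"));  // true

// One-pass variant built on the fixed class.
const fixedRegex = new RegExp(fixedSource, "gu");
const stripChars = (str) => str.normalize("NFD").replace(fixedRegex, "").toLowerCase();
```

Escaping the hyphen as `\-` should work as well. Whether the single pass is actually faster is a separate question that only a measurement can answer.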

predictedYES

@IsaacKing screw GPT4

It did successfully identify that trim() was superfluous!
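
A small caveat, sketched below with hypothetical names `withTrim` and `withoutTrim`: the final `replace` already removes plain spaces, so `trim()` changes nothing for space-padded input, but `trim()` also strips tabs and newlines at the ends, which the character class does not cover. Whether that matters depends on whether the titles and descriptions can start or end with such characters, which the thread does not say.

```javascript
// Original pipeline from the description.
const withTrim = (str) =>
  str.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase().trim().replace(/[- ,'".]/g, "");

// Same pipeline with trim() dropped.
const withoutTrim = (str) =>
  str.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase().replace(/[- ,'".]/g, "");

// Leading and trailing spaces: both versions agree.
console.log(withTrim("  Hello World  ") === withoutTrim("  Hello World  ")); // true

// Leading tab and trailing newline: only the trim() version removes them.
console.log(JSON.stringify(withTrim("\tHello World\n")));    // "helloworld"
console.log(JSON.stringify(withoutTrim("\tHello World\n"))); // "\thelloworld\n"
```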

I'm pretty sure I can do this but I'm offended at the notion that I'd work for manifoldbucks.

Why not do this once, create a database, and update the database with new entries? That would also minimize your contact with the Manifold API.

predictedNO

@MatthiasPortzel Because the result needs to be on the client, and sending over normalized versions of all the titles and descriptions would significantly increase loading times.

@IsaacKing Sounds like you’ve built it up very differently from how I would.

It shouldn’t significantly increase loading times if you gzip content before sending it to the client.

predictedNO

@MatthiasPortzel gzip isn't magic. The uncompressed market titles and descriptions are 23MB, which gzip can turn into 9.5MB. That will still take quite a while to load on a slower internet connection.
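
To put rough numbers on that, here is a back-of-the-envelope sketch. The 23MB and 9.5MB figures are the ones quoted above; the link speeds are assumptions chosen for illustration, and latency and protocol overhead are ignored.

```javascript
const uncompressedMB = 23; // figure quoted in the comment above
const gzippedMB = 9.5;     // figure quoted in the comment above

// Transfer time in seconds for a payload in megabytes over a link in megabits per second.
function downloadSeconds(sizeMB, linkMbitPerSec) {
  return (sizeMB * 8) / linkMbitPerSec;
}

console.log(`gzip keeps about ${Math.round((gzippedMB / uncompressedMB) * 100)}% of the original size`);

// Assumed link speeds, not taken from the thread.
for (const mbit of [1, 10, 100]) {
  console.log(`${mbit} Mbit/s: ${downloadSeconds(gzippedMB, mbit).toFixed(1)} s for the gzipped payload`);
}
```

Under these assumptions the gzipped payload needs 76 seconds at 1 Mbit/s and 7.6 seconds at 10 Mbit/s.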

@IsaacKing The client shouldn’t have all titles, that’s your problem.

predictedNO

@Paul How do you suggest programming a search engine to return results instantly if it doesn't store all the data on the client?

@IsaacKing This is normally done by making a request in JavaScript to the server after every keystroke and then returning an updated list of results. This scales to an arbitrarily large number of items on the server, while giving the user nearly-instantaneous (~100ms) results. For example, hn.algolia.com.

It is overkill for some use cases, but it’s not significantly harder to architect (I’ve done it before in under 100 lines of Node.js on the server), and it would solve the performance problems you’re encountering.
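
As a rough illustration of the server-side half of that approach (not the commenter's actual code), the core can be a function that normalizes every title once and then answers each query with only the matching titles. The names `normalizeKey`, `buildIndex`, and `searchIndex` are hypothetical, the substring match is a stand-in for whatever ranking a real search would use, and the HTTP handler that would call it is left out.

```javascript
// Same normalization as the function in the market description.
const normalizeKey = (str) =>
  str.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase().trim().replace(/[- ,'".]/g, "");

// Built once on the server, then extended as new markets appear.
function buildIndex(titles) {
  return titles.map((title) => ({ title, key: normalizeKey(title) }));
}

// Called per request; only the matching titles go back to the client.
function searchIndex(index, query, limit = 10) {
  const needle = normalizeKey(query);
  if (needle === "") return [];
  const hits = [];
  for (const entry of index) {
    if (entry.key.includes(needle)) {
      hits.push(entry.title);
      if (hits.length === limit) break;
    }
  }
  return hits;
}

const demoIndex = buildIndex(["Will Caf\u00e9 prices rise?", "Will it rain in Paris?", "Is P=NP?"]);
console.log(searchIndex(demoIndex, "cafe")); // [ 'Will Café prices rise?' ]
```

With this split the normalization cost is paid once per title on the server instead of on every page load, at the price of a network round trip per query.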

predictedNO

@MatthiasPortzel Hmm, I don't want it to be super laggy on a slow internet connection. It may also result in overloading the server if a lot of people are using it.

@IsaacKing All of these points are valid, but what happens when the number of markets expands by 10x or 1000x? Will it be possible to keep it client side indefinitely?

predictedNO

@Paul Most browsers will allow the client to use several GB of RAM, which is enough for up to ~50x the total number of markets on Manifold so far, including closed ones. Past that point I agree I would need to switch to a new system, but I'd hope that by then Manifold will have made its own search function work properly.
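
A quick check of that estimate, using the 23MB corpus size quoted earlier in the thread. The function name `estimateMB` and the `copies` factor (for example, keeping a normalized copy next to the original text) are assumptions for illustration, and real in-memory size also depends on how the engine stores strings.

```javascript
const currentCorpusMB = 23; // uncompressed size quoted earlier in the thread
const growthFactor = 50;    // the "~50x" from the comment above

// Raw text size in MB after growth, times the number of copies held in memory.
function estimateMB(corpusMB, growth, copies = 1) {
  return corpusMB * growth * copies;
}

console.log(`${estimateMB(currentCorpusMB, growthFactor)} MB of raw text at ${growthFactor}x`);
console.log(`${estimateMB(currentCorpusMB, growthFactor, 2)} MB if a second full-size copy is kept`);
```

That is 1,150 MB for one copy and 2,300 MB for two, which is in line with "several GB".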

predictedNO

@IsaacKing I don’t want my tab to take several GB of RAM nor have to use several GB of bandwidth to load the search page.

predictedNO

@Paul Sure, but most users likely won't care.

return str.replace(/[^[a-zA-Z]/, "").toLowerCase()

err drop the second [

predictedNO

@jfjurchen That deletes any character with a diacritical mark.
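
A small sketch of the difference, with hypothetical names `asciiOnly` and `nfdThenAscii`. The `g` flag is added here because without it `replace` only removes the first match. Note that either version also drops digits and every non-Latin script, which the original function does not.

```javascript
// The suggestion above (with the stray "[" dropped and a g flag added).
const asciiOnly = (str) => str.replace(/[^a-zA-Z]/g, "").toLowerCase();

// Decomposing first splits each accented letter into a base letter plus a
// combining mark, so only the mark is deleted.
const nfdThenAscii = (str) => str.normalize("NFD").replace(/[^a-zA-Z]/g, "").toLowerCase();

const sample = "Caf\u00e9 Z\u00fcrich"; // "Café Zürich" with precomposed letters

console.log(asciiOnly(sample));    // cafzrich
console.log(nfdThenAscii(sample)); // cafezurich
```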

@IsaacKing Those characters are European; they don't matter.
