How can I avoid duplicates in a table?

Hi,

I have millions of text lines to ingest into the database, and some of them are duplicates. How can I avoid inserting duplicate lines into the DB? For example, in Elastic I can overwrite the “_id” attribute when ingesting the data, but here I'm unable to do that. Any ideas?

There is no possibility to set any column as primary key as well afaik.

Thanks

There is no possibility to set any column as primary key as well afaik.

That’s correct. If you want to deduplicate, you need to use the id: derive it from the content of the line so that identical lines always get the same id. You can use any hash function that returns a bigint for that purpose.
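To illustrate the hashing idea, here is a minimal sketch in Python. It is not an official Manticore recipe: the function names `line_id` and `dedupe` are made up for this example, SHA-256 is just one possible hash, and masking to 63 bits is an assumption made so the value stays positive and fits a signed bigint.

```python
import hashlib


def line_id(line: str) -> int:
    """Derive a stable 64-bit-safe id from the text of a line (hypothetical helper)."""
    digest = hashlib.sha256(line.encode("utf-8")).digest()
    # Take the first 8 bytes and clear the top bit so the value is
    # non-negative and fits in a signed 64-bit integer (assumption).
    value = int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF
    # Avoid 0 in case the database treats it as "no id given" (assumption).
    return value or 1


def dedupe(lines):
    """Map id -> line, keeping the first line seen for each id."""
    unique = {}
    for line in lines:
        unique.setdefault(line_id(line), line)
    return unique


if __name__ == "__main__":
    rows = dedupe(["foo bar", "hello", "foo bar"])
    for doc_id, text in rows.items():
        print(doc_id, text)
```

Each resulting id would then be passed as the document id at ingest time, so a repeated line targets the same id instead of creating a new row. How an already existing id is handled depends on the statement you use, so check the INSERT and REPLACE documentation for that. Keep in mind that a truncated hash has a small but nonzero chance of collision between two different lines.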

The related issue is "Feature request: user defined constraints (unique key)", issue #1015 in manticoresoftware/manticoresearch on GitHub.
