Open Source
txtai is open-source, which allows users to freely access, modify, and distribute the code, fostering collaboration and innovation within the community.
Ease of Use
The library provides a simple API that makes it easy to integrate into existing projects, making it accessible for users with varying levels of technical expertise.
Versatile Functionality
txtai supports a wide range of NLP tasks including embeddings, search, question-answering, and translation, providing users with a comprehensive suite of tools.
Scalability
Designed to handle large datasets efficiently, txtai can scale its operations to suit both small projects and enterprise-level applications.
Active Development
The project is actively maintained and regularly updated, ensuring compatibility with the latest advancements in NLP technology.
Txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows. - Source: dev.to / 4 months ago
Ideal For: Projects requiring quick setup and robust search capabilities. - Source: dev.to / 4 months ago
Excellent project. As mentioned in another comment, I've put together an embeddings database using the arxiv dataset (https://huggingface.co/NeuML/txtai-arxiv) recently. For those interested in the literature search space, a couple other projects I've worked on that may be of interest. Annotateai (https://github.com/neuml/annotateai) - Semantic search and workflows for medical/scientific papers. Built on txtai... - Source: Hacker News / 4 months ago
If you're looking for a lightweight open-source framework designed to handle the patterns mentioned in this article: https://github.com/neuml/txtai Disclaimer: I'm the author of the framework. - Source: Hacker News / 4 months ago
I fully agree. Postgres has solved many of the problems that many are re-solving with GenAI-related databases. With txtai (https://github.com/neuml/txtai), I've gone all in with Postgres + pgvector. Projects can start small with a SQLite backend then switch the persistence to Postgres. With this, you get all the years of battle-tested production experience from Postgres... - Source: Hacker News / 5 months ago
Txtai (https://github.com/neuml/txtai) has had inline vectorization since 2020. It supports Transformers, llama.cpp and LLM API services. It also has inline integration with LLM models and a built-in RAG pipeline. - Source: Hacker News / 5 months ago
Excellent article on BM25! Author of txtai[1] here. Txtai implements a performant BM25 index in Python [2] via the arrays package and storing the term frequency vectors in SQLite. With txtai, the hybrid index approach [3] supports both convex combination when BM25 scores are normalized and reciprocal rank fusion (RRF) when they aren't [4]. [1] https://github.com/neuml/txtai [3]... - Source: Hacker News / 5 months ago
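The two fusion strategies named in the comment above can be sketched in plain Python. This is a minimal illustration of convex combination and reciprocal rank fusion in general, not txtai's implementation: every function name, the sample scores, the 0.5 weight, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def convex_combination(sparse, dense, weight=0.5):
    """Blend two normalized score maps: weight * dense + (1 - weight) * sparse."""
    docs = set(sparse) | set(dense)
    return {
        doc: weight * dense.get(doc, 0.0) + (1 - weight) * sparse.get(doc, 0.0)
        for doc in docs
    }


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists using ranks only: each list adds 1 / (k + rank)."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


def top(scores):
    """Return doc ids ordered by descending score, ties broken by id."""
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical results for one query (made-up numbers).
bm25_scores = {"a": 12.0, "b": 7.0, "c": 2.0}   # unbounded keyword scores
dense_scores = {"a": 0.2, "b": 0.9, "d": 0.6}   # similarities assumed to be in [0, 1]

# Path 1: normalize the keyword scores, then blend scores directly.
convex = convex_combination(min_max_normalize(bm25_scores), dense_scores)

# Path 2: skip normalization and fuse on rank positions instead.
rrf = reciprocal_rank_fusion([top(bm25_scores), top(dense_scores)])

print(top(convex))
print(top(rrf))
```

Convex combination needs both score sets on a comparable scale, which is why the keyword scores are normalized first. Reciprocal rank fusion ignores score magnitudes entirely, so it still works when the scores are not comparable.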
Author of txtai (https://github.com/neuml/txtai) here. I've been in the embeddings space since 2020 before the world of LLMs/GenAI. In principle, I agree with much of the sentiment here. Embeddings can get you pretty far. If the goal is to find information and citations/links, you can accomplish most of that with a simple embeddings/vector search. GenAI does have an... - Source: Hacker News / 6 months ago
In a crowded "AI space", I continue to work on txtai (https://github.com/neuml/txtai) for semantic search, LLM orchestration and language model workflows. It's not as popular as the big frameworks but I believe it's a better solution. Time will tell. - Source: Hacker News / 6 months ago
I'm always cautious with a one-size-fits-all approach. If a team is working on a small project and SQLite works then great. You can use a SQLite database on something like a $4/month DigitalOcean droplet. Can't say the same for Postgres. > AI is a bubble Many say this but Generative AI and LLMs have gotten bunched up with everything else. There is a clear need for vectors and multimodal search. There is no core... - Source: Hacker News / 9 months ago
Hello, author of txtai here. Txtai was created back in 2020 starting with semantic search of medical literature. It has since grown into a framework for vector search, retrieval augmented generation (RAG) and large language model (LLM) orchestration. The goal of txtai is to be simple and easy-to-use. Semantic Graphs were added in 2022 before the Generative AI wave of 2023/2024. Knowledge Graphs are now a hot... - Source: Hacker News / 10 months ago
I tend to agree with this sentiment. Many junior devs and/or those in college want to contribute. Then they feel entitled to merge a PR that they worked hard on often without guidance. I'm all for working with people but projects have standards and not all ideas make sense. In many cases, especially with commercial open source, the project is the base of a company's identity. So it's not just for drive-by ideas to... - Source: Hacker News / about 1 year ago
Bootstrapping only works if you have the runway to do it and you don't feel the need to grow fast. With NeuML (https://neuml.com), I've gone the bootstrapping route. I've been able to build a fairly successful open source project (txtai 6K stars https://github.com/neuml/txtai) and a revenue-positive company. It's a "live within your means" strategy. VC funding can have... - Source: Hacker News / about 1 year ago
I agree that in many cases people are puffing their feathers to try to be something they're not (at least not yet). Some believe in the fake it until you make it mentality. With NeuML (https://neuml.com), the website is a simple HTML page. On social media, I'm honest about what NeuML is, that I'm in my 40s with a family and not striving to be the next Steve Jobs. I've been able to build a fairly successful open... - Source: Hacker News / over 1 year ago
I'll add txtai (https://github.com/neuml/txtai) to the list. There is still plenty of room for innovation in this space. Just need to focus on the right projects that are innovating and not the ones (re)working on problems solved in 2020/2021. - Source: Hacker News / over 1 year ago
Nice project! I've long used Tika for document parsing given its maturity and wide number of formats supported. The XHTML output helps with chunking documents for RAG. Here are a couple of examples: - https://neuml.hashnode.dev/build-rag-pipelines-with-txtai - https://neuml.hashnode.dev/extract-text-from-documents Disclaimer: I'm the primary author of txtai ( - Source: Hacker News / over 1 year ago
If you're interested in graphs + RAG and want an alternate approach, txtai has a semantic graph component. https://neuml.hashnode.dev/introducing-the-semantic-graph https://github.com/neuml/txtai Disclaimer: I'm the primary author of txtai. - Source: Hacker News / over 1 year ago
This is an informative page about txtai. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.