Thoughts about ArangoDB (the database)

About multi-model databases

June 04, 2024

database

Some things I know about ArangoDB (the database)

  • It's a multi-model database. That just means a database which can store multiple data models in one place, for example documents, graphs, or key-value entries like an OTP which will only be stored for 30s.
  • This is not the same as a multi-modal database (which stores multimedia etc.).
  • There are very few real alternatives which are production-ready.
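To make "multiple data models" a bit more concrete for myself, here is a minimal sketch in plain Python. It is not ArangoDB's API, and all the names and data (`kv`, `documents`, `edges`, `in_city`, `reachable`) are made up for illustration. It only shows the kind of question each model answers: a lookup by key, a filter on fields, and a walk along relationships.

```python
from collections import deque

# Hypothetical data in plain Python structures; this is NOT ArangoDB's API.

# 1. Key-value: an opaque value looked up by its key.
kv = {"otp:alice": "493021"}

# 2. Document: records with named fields that can be filtered on.
documents = {
    "alice": {"name": "Alice", "city": "Pune"},
    "bob": {"name": "Bob", "city": "Delhi"},
    "carol": {"name": "Carol", "city": "Pune"},
}

# 3. Graph: edges (from, to) that connect documents, here "follows".
edges = [("alice", "bob"), ("bob", "carol")]


def in_city(docs, city):
    """Document-style query: keys of all records whose city matches."""
    return sorted(key for key, doc in docs.items() if doc["city"] == city)


def reachable(edge_list, start, max_hops):
    """Graph-style query: nodes reachable from start within max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for src, dst in edge_list:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append((dst, hops + 1))
    seen.discard(start)
    return seen


print(kv["otp:alice"])                       # key-value lookup
print(in_city(documents, "Pune"))            # document filter
print(sorted(reachable(edges, "alice", 2)))  # graph traversal
```

As I understand it, the pitch of a multi-model database is that all three kinds of question can be asked of the same stored data, instead of keeping a separate system for each.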

I'm still learning the implications of being a multi-model database. Does it mean some things become easy? For example, the OTP stuff can be done in-memory in application code, but would having a data model that supports it be much better? I need to understand why people choose ArangoDB.
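For reference, the "in-memory by using code" version of the OTP idea could look roughly like the sketch below. Everything here is an assumption for illustration: the `OtpStore` class, its method names, and the injectable `clock` parameter are mine, and the 30-second lifetime is just the number from the example above.

```python
import time


class OtpStore:
    """Hypothetical in-memory OTP store; each code expires after ttl_seconds."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._codes = {}     # user_id -> (code, expires_at)

    def put(self, user_id, code):
        """Store (or overwrite) the code for a user with a fresh expiry time."""
        self._codes[user_id] = (code, self._clock() + self._ttl)

    def verify(self, user_id, code):
        """Return True once for a matching, unexpired code; False otherwise."""
        entry = self._codes.get(user_id)
        if entry is None:
            return False
        stored, expires_at = entry
        if self._clock() >= expires_at:
            del self._codes[user_id]  # expired: clean up lazily on access
            return False
        if stored != code:
            return False
        del self._codes[user_id]      # one-time: a used code is removed
        return True


store = OtpStore(ttl_seconds=30.0)
store.put("alice", "493021")
print(store.verify("alice", "493021"))  # first use of a fresh code
print(store.verify("alice", "493021"))  # same code again, already consumed
```

The catch with this approach is that the codes disappear when the process restarts and are not shared between multiple app servers, which is where expiry handled by the database starts to look attractive. ArangoDB does have TTL indexes that remove documents some time after a timestamp attribute, though I have not tried them and the sketch above does not use them.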

Things I have found

  • This master's thesis comparing Arango/Mongo/Neo4j: https://pdfs.semanticscholar.org/ef94/b6c0e60965432bd8fdfe2456c7581cd2c7c3.pdf

    Quoting from it:

    "ArangoDB is the name of a database system under a free and open source license, created in 2011. Initially, it was published under the name AvocadoDB, which faced legal problems, and a year later it got its current name. Proposed by ArangoDB. The database system is one of the most universal because it supports as many as three data models: key-value, documents and graphs."

    "The motivation for this direction of development was the combination of models already offered by NoSQL databases in one solution. Neo4j uses graphs, MongoDB uses documents, and the creators of ArangoDB decided to create a base with a multi-model approach to overcome the need to use different solutions for different types of data. This relatively new database system has many clients, incl. this is Cisco or Thomson Reuters, which is one of the largest message delivery companies. It uses ArangoDB to build its platform for internal information exchange, analysis and intelligence management. The AQL query language in particular helps here."

    "Despite its young age, ArangoDB has been equipped with many functionalities that competitors have. Providing, among others, vertical scaling - that is, the use of more and more efficient servers, and horizontal scaling, i.e. the creation of clusters from many servers. Like MongoDB and Neo4j, it uses many servers, each of which has some data and depending on it can be a leader (parent server) or a slave server. The entire system also uses coordinators who using the developed data generator act as an intermediary between the ArangoDB cluster and clients, thanks to which they perform all or part of the query depending on the data they have. Such a structure will ensure easy scalability by adding new servers to the pool of machines. The model used is also an important aspect of scalability. Different models scale better vertically or horizontally"

    I assume that in the last part of that quote (different models scaling better vertically or horizontally), the models being referred to are "key-value, documents and graphs."

  • This comment from the community Slack: "Right now ArangoDB is out of competition. They're only production grade multimodel database"

Some direct competitors that I found

  • OrientDB - this claims to be the "first multi-model db", with reviews such as "OrientDB solves problems that other software can not solve"
  • SurrealDB - founded 2021, London based, raised $18M, with reviews such as "SurrealDB has embedded JS functions and live querying on top of many things - feels like the React of DB's" - but not ready for production.