
The Data Daily

Designing a Multi-Language Database


Today we look at three best-practice database designs to store data in multiple languages and easily scale to new markets.

Reaching millions of users with an application is every developer's dream. Achieving this goal becomes easier if users from all over the world can use your application. Since not all users know English or your target language, you need to design your application to be multilingual. This would not be possible without a multi-language database.

Designing a multi-language database that can easily scale to new languages is difficult. This is why you should rely on best practices. Here, you will see why you need a multi-language database, three great multi-language database designs, and how to choose the best one for you.

Let’s now learn everything you need to know about multi-language database design.

You never know how much your project will grow, and your small app for your friends could become an international service used by millions of people. If you want your product to be able to scale internationally, you need to design it so that you can easily adapt it to different cultures and markets. That is what internationalization is about.

Building a database ready for internationalization means designing a database that can store multilingual data. In other words, the backend should be able to provide data in multiple languages. To do this, the backend should connect to a multi-language database and retrieve the data from it.

Notice that giving users the possibility to switch between several languages is a nice feature to have. This is especially useful for polyglots or non-native speakers. So, even if your project is small and targets a local market, you should consider a multi-language database. After all, you cannot know in advance how successful your project will be and which users will use it. Let’s now learn why choosing the right multi-language database design for your needs is crucial.

Changing a database is a very costly operation in terms of time and energy because it triggers a ripple effect. Modifying the structure of the database involves changing the backends connected to it. This may also require you to adapt the frontends that rely on those backends accordingly. As you can see, changing a database structure is not an operation without consequences. For this reason, you should design a database that can scale easily and that you do not have to change frequently.

As you can imagine, there are several ways to design a multi-language database, and each solution has its pros and cons. Considering the importance of the database structure for an application, you have to design your multi-language database wisely. This is why you should not start from scratch, but rely on best practices.

Let’s now take a look at some multi-language designs based on best practices.

Let’s delve deeper into the pros and cons of three multi-language designs I have personally used as a full-stack web developer working for startups all over the world.

In this approach, each field in a multilingual table has a number of columns equal to the number of languages supported by the database.

Specifically, this is what the column name template looks like:
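As a minimal sketch of the column approach, assume a hypothetical `product` table and a `<field>_<language code>` naming pattern such as `name_en` and `name_it`. Both the table and the naming pattern are illustrative assumptions, not necessarily the exact template meant above. The example uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical schema: one column per translatable field per language.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id             INTEGER PRIMARY KEY,
        price          REAL NOT NULL,
        name_en        TEXT NOT NULL,
        name_it        TEXT,
        description_en TEXT,
        description_it TEXT
    )
    """
)
conn.execute(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?, ?)",
    (1, 9.99, "Chair", "Sedia", "A wooden chair", "Una sedia di legno"),
)

# Languages that currently have columns in the table (assumed set).
SUPPORTED_LANGUAGES = {"en", "it"}


def get_product_name(conn, product_id, language):
    """Read the name column that matches the requested language."""
    # Column names cannot be bound as SQL parameters, so the language code
    # is checked against a whitelist before it is placed in the query.
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language}")
    row = conn.execute(
        f"SELECT name_{language} FROM product WHERE id = ?", (product_id,)
    ).fetchone()
    return row[0] if row else None


# Supporting a new language requires changing the table structure.
conn.execute("ALTER TABLE product ADD COLUMN name_fr TEXT")
SUPPORTED_LANGUAGES.add("fr")
```

Reading a translation needs no join and no duplicated rows, but every new language means an `ALTER TABLE` for each translatable field, and existing rows hold `NULL` in the new columns until they are translated.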

In this approach, there is one row for each language. The key to identify a multi-language entity is the following composite primary key:
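As a minimal sketch of the row approach, assume a hypothetical `product` table whose composite primary key is the pair (`id`, `language_code`). The table and column names are illustrative assumptions. The example uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical schema: the same entity id appears once per language.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id            INTEGER NOT NULL,
        language_code TEXT    NOT NULL,
        name          TEXT    NOT NULL,
        price         REAL    NOT NULL,
        PRIMARY KEY (id, language_code)
    )
    """
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (1, "en", "Chair", 9.99),
        (1, "it", "Sedia", 9.99),  # non-translatable price is duplicated
    ],
)


def get_product_row(conn, product_id, language_code):
    """Fetch the row for one entity in one language, or None if missing."""
    return conn.execute(
        "SELECT name, price FROM product WHERE id = ? AND language_code = ?",
        (product_id, language_code),
    ).fetchone()


def has_duplicate_protection(conn):
    """Check that the composite key rejects a second row for (1, 'en')."""
    try:
        conn.execute("INSERT INTO product VALUES (1, 'en', 'Stool', 5.0)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Adding a language only requires inserting new rows, with no schema change. The trade-off is that language-independent fields such as `price` are repeated in every row of the same entity and have to be kept in sync.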

In this approach, each column of a multi-language table is a foreign key to a translation table. In other words, there is a translation table for each table that involves multi-language fields.
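As a minimal sketch of this idea, assume one common variant: a hypothetical `product` table that keeps only language-independent fields, plus a `product_translation` table keyed by (`product_id`, `language_code`). The names and the exact layout are illustrative assumptions and may differ from the structure described above. The example uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        price REAL NOT NULL
    );
    CREATE TABLE product_translation (
        product_id    INTEGER NOT NULL REFERENCES product(id),
        language_code TEXT    NOT NULL,
        name          TEXT    NOT NULL,
        PRIMARY KEY (product_id, language_code)
    );
    """
)
conn.execute("INSERT INTO product VALUES (1, 9.99)")
conn.executemany(
    "INSERT INTO product_translation VALUES (?, ?, ?)",
    [(1, "en", "Chair"), (1, "it", "Sedia")],
)


def get_translated_product(conn, product_id, language_code):
    """Join the base row with its translation for the requested language."""
    return conn.execute(
        """
        SELECT t.name, p.price
        FROM product AS p
        JOIN product_translation AS t ON t.product_id = p.id
        WHERE p.id = ? AND t.language_code = ?
        """,
        (product_id, language_code),
    ).fetchone()
```

Here a new language only adds rows to `product_translation`, and `price` is stored once. The cost is the join on every read of translated data.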

As you can see, every multi-language database design has the same number of pros and cons. This means that there is no preferred approach in absolute terms. You have to choose carefully based on your requirements, needs, and goals. I have personally used all three approaches in different projects and learned the lessons I've shared.

The column approach is particularly useful when dealing with Big Data, especially when your tables hold several million rows and you simply cannot afford additional queries or duplicated content. Although it is not the most scalable solution, the column approach is the only viable design with large data.

On the other hand, the row approach is useful when the branches of your company operating in local markets have freedom of action. In this case, duplication of content can even be an advantage.

Finally, the translation table approach is the right design if you are looking for an elegant and scalable solution and do not mind the performance drawback. It is best for projects where you know that the data will not grow too large.

Designing a database that can store multilingual data is essential to help your business scale internationally. Also, it enables non-native speakers to use your application. Changing a database is a complex and time-consuming operation with non-negligible consequences on the application architecture. This is why you should design your database as multi-language right from the start. Here, we have looked at three different approaches to designing a multilingual database, studying their pros and cons, and going into detail about which one is best for you.
