Cadu Ribeiro
Coffee addicted Software Engineer
Accepting Emojis on a Rails app with MySQL
- 3 minsIn the past weeks at work, we faced a problem in our application. A user tried to express himself with an emoji in the description field. We were not expecting it then you can imagine what happened. 💥 in production 😱.
We received an alert in the team’s chat.
Going through the error details, we can see that MySQL raised an error.
And looking at the request, we can see the emoji in the body attribute:
Why did this happen? Is not Emoji a unicode character supported by the UTF8? Yes, it is. But some of them uses 4-bytes to store their data, and if we look at the UTF8 charset support at MySQL’s oficial doc, we can see that it can only accept 3-bytes.
- A maximum of three bytes per multibyte character.
It’s a different approach from Postgres’ UTF8. In Postgres charset table we see that UTF8 can store up to 4-bytes, meaning that Postgres already accepts Emojis by default.
How to store Emojis on MySQL database and avoid the Incorrect String value error?
Let’s check the MySQL’s doc again. We can see that in charsets support there is one item called utf8mb4. It’s a UTF8 with 4-bytes support. We should use it instead the default 3-bytes only.
I created a simple scaffold application for demonstration. Let’s use the model Comment
with 2 properties (body and name).
If we try to save the comment with an emoji on the body, it will raise an exception.
Let’s generate a migration to convert this table and the columns to utf8mb4
and add the following content to the migration:
See that we set the column name
with VARCHAR(191)
. This is because MySQL’s max key length for index on InnoDB engine is 767 bytes. With utf8
(3-bytes), we can store VARCHAR with a maximum of 255 chars (255 chars * 3 bytes = 765 bytes), but with utf8mb4
we can store the maximum of 191 chars (191 chars * 4 bytes = 764 bytes). If you want to store more bytes on the index, please look at InnoDB large prefix.
We need to change the database.yml
to set the encoding to utf8mb4. So, open the config/database.yml
and change the line with
to
Restart the server and now we can save emoji in our comment 😎.
If you are creating a new project, I highly recommend to start with utf8mb4
to avoid these issues in the future and eliminate the necessity of a migration for all tables or simply use Postgres instead ♥️
Cheers 🍻