Put Your Kindle Vocabulary Into Anki Cards
When you select a word on a Kindle to get its translation or definition, the word can be saved into a local vocabulary database on the device, either for later reference or to train your language skills. This feature is called Vocabulary Builder.
Out of curiosity, I recently opened this list and found that I had looked up a lot of words, all of which had been put into the database. I wanted to extract these words, get their translations and put them into Anki flashcards to go through from time to time. Or not. Anyway, the vocabulary is an SQLite 3 database, so it can easily be read and run through a service such as Google Translate or DeepL.
Anki, unfortunately, doesn’t provide any official web-based API, so one needs to have Anki installed locally and use a plugin which exposes a local API on the system where it’s running. Not the best, but relatively easy to do. To make everything a bit more burdensome, the Kindle vocabulary is not synced to Amazon either and has to be pulled from the device manually via a USB transfer. So this is probably a thing I’ll do every 5 years.
The utility is written in Python and available on GitHub. You need to have Python installed and available on your system. Installing the requirements also pulls in packages for making web requests and the SDK for DeepL.com translation and Google Translate.
You can install it by cloning the repository and installing the required packages:
git clone https://github.com/pew/kindle-vocab.git
cd kindle-vocab
pip install -U -r requirements.txt
Get the Vocabulary and Translate Entries
To get the vocabulary, connect your Kindle to a computer and locate the vocab.db file, which is stored on the Kindle in the system/vocabulary folder. Once connected, the Kindle should be mounted like a USB thumb drive. Copy the file somewhere on your computer, ideally into the cloned repository from the previous step.
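Since vocab.db is a plain SQLite file, a few lines of Python are enough to peek at what the Kindle collected. The following is a minimal sketch, not the code from the repository. It assumes the looked-up words live in a table named WORDS with a word column, which is how the file is commonly laid out; the function name read_kindle_words is made up for illustration.

```python
import sqlite3


def read_kindle_words(db_path):
    """Return the distinct looked-up words from a Kindle vocab.db file."""
    conn = sqlite3.connect(db_path)
    try:
        # Assumption: words are stored in a WORDS table with a "word" column.
        rows = conn.execute(
            "SELECT DISTINCT word FROM WORDS ORDER BY word"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Calling read_kindle_words("vocab.db") from the folder that holds the copied file should return a sorted list of words, provided the table layout matches the assumption above.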
Translate All the Things
I recommend getting an API key from DeepL, since it’s an official integration, they have a generous free tier (500k characters per month) and the results are usually very good. The Google Translate integration is an unofficial wrapper around the website and could break at any time. I’m using DeepL as an example here.
Once the packages are installed, export your API key as an environment variable and run the application like so:
python application.py -e deepl -l de -i vocab.db -o dictionary.db
This picks DeepL as the translation engine (-e), translates everything to German (-l), uses vocab.db as the source vocabulary (the Kindle one) with -i and writes everything into dictionary.db using the -o flag.
You can also just run python application.py -h to get a help menu with some examples.
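For readers wondering how such flags are typically wired up, here is a rough sketch using argparse from the standard library. It only mirrors the four short flags described above. The destination names, defaults and help texts are assumptions for illustration, and the parser in the repository may well look different.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the -e, -l, -i and -o flags."""
    parser = argparse.ArgumentParser(
        prog="application.py",
        description="Translate the words from a Kindle vocabulary database.",
    )
    # All dest names and defaults below are hypothetical.
    parser.add_argument("-e", dest="engine", default="deepl",
                        help="translation engine to use")
    parser.add_argument("-l", dest="language", default="de",
                        help="target language code")
    parser.add_argument("-i", dest="input", default="vocab.db",
                        help="path to the Kindle vocabulary database")
    parser.add_argument("-o", dest="output", default="dictionary.db",
                        help="path of the database to write translations to")
    return parser
```

Passing the arguments from the example command to build_parser().parse_args() yields an object with engine, language, input and output attributes.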
This reads through the Kindle vocabulary file, calls the DeepL API for each word and saves the results into dictionary.db. Duplicate entries are skipped. The dictionary.db file is more or less just two columns: the original word and its translation.
Next, to import everything into Anki, install the AnkiConnect plugin / add-on, which exposes a local HTTP API for communicating with the Anki application. Make sure to keep Anki open while executing the next step, otherwise the API is unavailable.
Create a new Anki deck with the name kindle (if you want to change it, update the ankiconnect.py file from the repository), then run that script.
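The translate-and-store step can be sketched roughly like this. It is not the repository's implementation: the table name dictionary, the column names original and translation, and the function save_translations are all assumptions. The translation itself is passed in as a plain callable so the sketch works with any engine.

```python
import sqlite3


def save_translations(db_path, words, translate):
    """Translate each new word, store it, and return how many rows were added."""
    conn = sqlite3.connect(db_path)
    added = 0
    try:
        # Hypothetical two-column schema: original word and its translation.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS dictionary ("
            "original TEXT PRIMARY KEY, translation TEXT NOT NULL)"
        )
        for word in words:
            exists = conn.execute(
                "SELECT 1 FROM dictionary WHERE original = ?", (word,)
            ).fetchone()
            if exists:
                continue  # skip duplicates before spending any API quota
            conn.execute(
                "INSERT INTO dictionary (original, translation) VALUES (?, ?)",
                (word, translate(word)),
            )
            added += 1
        conn.commit()
    finally:
        conn.close()
    return added
```

With the official deepl package, the callable could be something like lambda w: translator.translate_text(w, target_lang="DE").text, where translator is a deepl.Translator created with your API key. Checking for an existing row first means a second run only calls the translation service for words it has not seen yet.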
This reads the dictionary.db file and submits each entry to the Anki deck. Now the hard part: actually open up Anki and learn the words. 🤷‍♂️
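To give an idea of what talking to AnkiConnect looks like, here is a small sketch of a single addNote request. It is not the ankiconnect.py script from the repository. It assumes AnkiConnect is listening on its default address, 127.0.0.1 port 8765, and that the deck uses Anki's stock Basic note type with Front and Back fields. The helper names build_add_note and invoke are made up for this example.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # AnkiConnect's default address


def build_add_note(front, back, deck="kindle", model="Basic"):
    """Build the JSON payload for AnkiConnect's addNote action."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                # Assumption: the stock "Basic" note type with Front/Back fields.
                "modelName": model,
                "fields": {"Front": front, "Back": back},
                "options": {"allowDuplicate": False},
                "tags": ["kindle"],
            }
        },
    }


def invoke(payload, url=ANKI_CONNECT_URL):
    """POST a payload to AnkiConnect and return its result (Anki must be open)."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With Anki running, invoke(build_add_note("aloof", "distanziert")) should create one card in the kindle deck. A duplicate note is reported back as an error, which invoke raises as a RuntimeError.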
The AnkiConnect tool should be integrated into the main application, and another column could be added to the database schema to skip entries that were already processed, both for translation and for submitting them to Anki. AnkiConnect already ignores duplicates and throws them away, but skipping them up front would make things way faster.
Also, it would be great not just to have a simple translation of a word, but also a definition of the word within a sentence. The Oxford Dictionaries provide an API endpoint as well, although with a more limited free tier.
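The extra column idea could look roughly like the following. This is only a sketch of one possible approach: the table name dictionary, its original and translation columns, and the anki_synced flag are all hypothetical, as are the helper names.

```python
import sqlite3


def ensure_synced_column(conn):
    """Add a hypothetical anki_synced flag to the table if it is missing."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(dictionary)")]
    if "anki_synced" not in columns:
        conn.execute(
            "ALTER TABLE dictionary "
            "ADD COLUMN anki_synced INTEGER NOT NULL DEFAULT 0"
        )


def pending_entries(conn):
    """Return (original, translation) pairs not yet submitted to Anki."""
    return conn.execute(
        "SELECT original, translation FROM dictionary "
        "WHERE anki_synced = 0 ORDER BY original"
    ).fetchall()


def mark_synced(conn, original):
    """Flag one entry as submitted so later runs skip it."""
    conn.execute(
        "UPDATE dictionary SET anki_synced = 1 WHERE original = ?", (original,)
    )
```

A sync run would then loop over pending_entries, submit each pair to Anki, call mark_synced on success and commit at the end, so entries that already made it into the deck are never sent again.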