Migration from Python 2 to 3: Strategies, Tools and More
In part 1 of this blog series, we discussed why migrating from Python 2 to 3 is so important for embedded applications. We also touched upon the kinds of changes one should expect during the migration. We highly recommend that you go through the first part of the Python migration blog before reading its sequel.
In part 2 of the series, our focus is on the migration strategy and the various tools that help make the transition from the older version of Python to version 3 a smooth one.
Manual or Automated: Which Way to Go for Python Migration?
Migrating to the latest version of Python requires you to choose between two strategies: manual and automated. One of the major drivers of this choice is a clear understanding of the project’s status in terms of size, complexity, type of application and so on.
For instance, if you plan to migrate an application to Python 3, it is only the top-level scripts that need to be reworked. Since very few internal modules depend on an application, they do not need to be changed.
In contrast, if a framework is to be migrated, a number of plug-ins and applications depend on it. A small technical change in the framework will impact several applications and modules. Therefore, your migration strategy will differ based on whether your embedded application is a provider of libraries or a consumer.
Another major factor to be taken into account while choosing the migration strategy is your user-base and the revenue model of your Python-based software. If the software is purely commercial, you may want to speed up the migration process or prioritize certain modules for migration. In case your application is an open-source project used within a team under an enterprise, the migration strategy could be more relaxed.
Once you are clear about the nature of your applications, you have two options to choose from. One is the complete re-write of the code from scratch, and the other is using an automated tool for the purpose.
At times, complete automation of Python migration is not feasible, and some manual code re-writing becomes necessary in order for the application to perform the assigned task correctly. Also, if the code base is not too large, a manual migration can be done to save the tool cost.
Steps Involved in Migration from Python 2 to 3
Whether you are going the manual or automated way of Python migration, there are certain toolboxes and frameworks you will require. The most obvious ones are the older Python version 2.X and newer version to which you wish to migrate, i.e. Python 3.X. A few frameworks will also need to be readied based on the user community you are catering to.
Some basic tasks to be performed for Python migration from 2 to 3:
- Libraries/frameworks to support the older versions of Python 3
- Compatibility specifications to be spelled out in the project’s readme file
- A virtual environment manager to be installed for handling multiple Python versions side by side
- A Python code analysis tool such as Pylint must be kept handy as it helps identify the syntax changes required for the migration to be done the right way
- A Python testing framework such as Pytest is also required to run the tests after migration; code coverage can be measured with a plugin such as pytest-cov
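One way to back up the compatibility statement in the readme is to have a top-level script check the interpreter version at start-up. The sketch below is only illustrative: the function name is_supported and the minimum version (3, 8) are assumptions made for this example, not values taken from this article.

```python
import sys

# Hypothetical minimum version; use whatever your readme promises.
MINIMUM_VERSION = (3, 8)


def is_supported(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the given interpreter version meets the minimum."""
    # Compare only (major, minor) so that patch releases are ignored.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not is_supported():
        sys.exit("This application needs Python %d.%d or newer" % MINIMUM_VERSION)
```

Keeping the check in a small function makes it easy to unit test with version tuples for interpreters you do not have installed.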
Once the toolbox is ready, we can move on to the first of the strategies: manual code re-writing.
Manual Method for Python 2 to 3 Migration
The manual code re-write strategy is all about creating a new version of the software that conforms to Python 3. You bid goodbye to the older version and its legacy code. However, the old code still acts as a reference and as a source of test cases for unit testing.
The first step in writing the new code is to migrate the old unit test cases to the Python 3 environment. These tests are then run, and new code is written based on whether they pass or fail. Code that fails the unit tests needs to be re-written; code that passes can be preserved.
Any latent bugs found in the old code should be fixed, because they might not manifest in as friendly a manner in the Python 3 environment as they did in the older version.
At times, some unit tests might also need to be re-written for Python 3. This is because the newer Python version ships a built-in mock library (unittest.mock, part of the standard library since Python 3.3) that simplifies unit tests. Also, the tests must be in sync with the new software architecture. However, if you have simpler unit tests that do not rely on features unique to Python 3, they can be left unchanged.
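To illustrate how the built-in mock library can simplify a test, here is a minimal sketch. The function read_sensor, the read_register method and the register address 0x10 are all hypothetical names invented for this example; only unittest and unittest.mock are real standard-library modules.

```python
import unittest
from unittest import mock


def read_sensor(bus):
    """Hypothetical function under test: reads a raw value and scales it."""
    raw = bus.read_register(0x10)
    return raw / 10  # "/" is true division in Python 3


class ReadSensorTest(unittest.TestCase):
    def test_scales_raw_value(self):
        # A Mock stands in for the hardware bus, so no device is needed.
        bus = mock.Mock()
        bus.read_register.return_value = 255
        self.assertEqual(read_sensor(bus), 25.5)
        bus.read_register.assert_called_once_with(0x10)
```

For embedded code, replacing the hardware-facing object with a mock like this lets the same test run on a developer machine and on a build server.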
Now that the unit tests have run, it is time to manually write the code based on the unit test reports. As the new code follows the test results, developers refer to this approach as test-driven development. The code is subject to iterative testing, i.e., the new code is tested against the test cases, the failures are analysed, and changes are made to the code. Ideally, every iteration of unit testing reduces the number of failures and brings the code nearer to perfection.
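As a small illustration of this test-then-fix loop, consider a helper whose Python 2 version relied on integer division. The function midpoint and its test are assumed examples written for this post, not code from a real project. Under Python 3 the "/" operator always returns a float, so a test that uses the result as a list index fails until the code is changed to "//".

```python
def midpoint(low, high):
    """Return the middle index between low and high.

    The Python 2 original (assumed here) used (low + high) / 2, which
    yields a float in Python 3 and cannot be used as a list index.
    """
    return (low + high) // 2  # floor division keeps the result an int


def test_midpoint_is_usable_as_index():
    items = ["a", "b", "c", "d", "e"]
    index = midpoint(0, len(items) - 1)
    assert isinstance(index, int)
    assert items[index] == "c"
```

A test runner such as Pytest collects functions named test_* automatically, so the failure shows up on the first run and disappears once the division is fixed.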
Migration from Python 2 to 3, the Automated Way
When you have a huge code base, it is practically impossible to re-write each line of code and test it in an iterative manner. The time, cost and effort involved would be enormous. For such projects, automated migration is recommended.
Automation needs tools, and for automated migration to Python 3 there is a tool with a pretty straightforward name: 2to3.
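To give a feel for what the tool does, the sketch below shows a small Python 2 function (kept as comments, since it is not valid Python 3) next to roughly what 2to3 would produce for it. The function report and its data are made up for illustration.

```python
# Python 2 original (shown as comments because it is not valid Python 3):
#
#     def report(readings):
#         for name, value in readings.iteritems():
#             print "%s = %d" % (name, value)
#         return readings.keys()
#
# Approximately what 2to3 produces for it:

def report(readings):
    for name, value in readings.items():      # iteritems() -> items()
        print("%s = %d" % (name, value))      # print statement -> print() call
    return list(readings.keys())              # keys() wrapped in list()
```

On interpreters that still ship the tool, running 2to3 -w on a file rewrites it in place. Note that 2to3 was deprecated in Python 3.11 and removed in Python 3.13, so it has to be run from an older Python 3 installation.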
Automated migration from Python 2 to 3 is a 6-step process. Let’s examine each of them in some detail:
- Creating unit test cases: This is one of the most crucial steps in the migration, as the tests are what confirm that the migrated code works. However, it can get challenging at times. There are instances where the legacy code has no unit tests, or the tests do not cover certain migration issues like data type conversions. Even worse, they could have been written using syntax that has since changed completely. In such circumstances, tools like 2to3 and Six are quite useful.
- Resolving issues due to syntax changes: This is where the real migration starts. The issues arising from syntax changes are addressed in this step. The Pylint tool helps in this process by highlighting the areas of the code that will be a problem under Python 3.
- Running the test cases in both Python 2 and 3 environments: Unit tests are run on the legacy code. Usually, in the first iteration, all unit tests fail. The trick is to have a system in place to repeat the unit testing until you get it right.
- Executing the migration using the 2to3 tool: Based on the failure report of the unit tests, the 2to3 tool performs the migration of the available code. The migrated code should work fine. In some instances, however, some syntax adjustments and reworking of the unit test cases might be required. If there are failures, the unit tests must be re-run and the migration repeated until the code fulfills its intended purpose.
- Fixing the bugs and re-testing: Even after the code is fully migrated to Python 3, the work is only half done! The code now has to be tested in the Python 3 environment. You might need to adjust the Python 2 code and re-run the migration, or, if you have moved past that stage, adjustments to the Python 3 code will suffice. These minor adjustments continue until the migration is complete and the code works as intended and passes its tests.
- Optimization after the migration: An automated conversion sometimes introduces unwanted elements into the code, such as an unnecessary float() or list() call. These elements need to be cleaned up. To be extra careful, re-run the tests after every cleanup or optimization.
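The clean-up in the last step can be sketched as follows. Both functions are assumed examples: the first mimics code left behind by an automated conversion, with a defensive list() wrapper, and the second is a hand-tidied equivalent.

```python
def active_channels(channels):
    """Version as left by an automated conversion (assumed example)."""
    names = []
    for name in list(channels.keys()):        # list() copy is not needed here
        if channels[name]:
            names.append(name)
    return names


def active_channels_clean(channels):
    """Hand-tidied equivalent: iterate over the dict items directly."""
    return [name for name, enabled in channels.items() if enabled]
```

The list() wrapper is only worth keeping when the dictionary is modified inside the loop. Comparing the output of both versions in a unit test before deleting the old one is a cheap way to confirm that the clean-up did not change behaviour.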
Manual or automated, you need to get your Python code migrated from version 2 to 3. For embedded system applications, whether automotive or otherwise, the migration becomes all the more important. Embitel has been working on such migration projects, especially since support for Python 2 ended. If you have queries related to Python migration for embedded applications, we are there for you!