Coding Practices
A. Best Coding Habits
- Keep Code Clean
- Design
- Don’t expose the internals
- Keep implementation details hidden
- Dispensables
- Remove dead code
- Avoid print statements
- Variables
- Variable names should reveal intent
- Functions
- Use functions to keep code ‘DRY’ (don’t repeat yourself)
- A function should do one thing
- Design
- Use functions to abstract away complexity
- By doing so we gain readability, testability, and reusability
- Smuggle code out of Jupyter notebooks ASAP
- Law of flat surfaces: any flat surface at home accumulates clutter. Jupyter notebooks are the flat surfaces of the ML world
- Apply test-driven development
- Write unit tests
- Write functional tests to assert that the model's metrics are above our expected threshold
- Make small and frequent commits
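As a minimal sketch of several habits above (intent-revealing names, a function that does one thing and hides its details, and a unit test), assume a notebook cell that fills missing values with the median. The function name `fill_missing_with_median` and the sample data are hypothetical, chosen only for illustration.

```python
from statistics import median
from typing import List, Optional, Sequence


def fill_missing_with_median(values: Sequence[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the known values.

    Callers only see the name and the result; how the fill value is
    computed stays hidden inside the function.
    """
    known_values = [value for value in values if value is not None]
    if not known_values:
        raise ValueError("cannot impute: every value is missing")
    fill_value = median(known_values)
    return [fill_value if value is None else value for value in values]


def test_fill_missing_with_median():
    # Unit test: small, hand-checkable input (median of 1.0 and 3.0 is 2.0).
    assert fill_missing_with_median([1.0, None, 3.0]) == [1.0, 2.0, 3.0]
```

A test runner such as pytest would collect the `test_` function automatically; calling it directly also works.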
B. Common Code Smells
- Dead code - does not affect the output and distracts from the flow (e.g. print statements)
- Exposed internals - hide the details behind a function
- Duplication - extract the repeated code into a function
- Irrelevant variable names
- Magic numbers
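Two of the smells above, duplication and magic numbers, can be illustrated with a small before/after sketch. The data, the constant name `METRES_PER_INCH`, and the function `inches_to_metres` are assumptions made up for this example.

```python
from typing import List, Sequence

train_heights_in = [60.0, 72.0]
test_heights_in = [65.0]

# Before: the same expression is copy-pasted, and 0.0254 is a magic number.
train_heights_m_before = [height * 0.0254 for height in train_heights_in]
test_heights_m_before = [height * 0.0254 for height in test_heights_in]

# After: the number gets a name and the repeated logic lives in one function.
METRES_PER_INCH = 0.0254


def inches_to_metres(lengths_in_inches: Sequence[float]) -> List[float]:
    """Convert a sequence of lengths from inches to metres."""
    return [length * METRES_PER_INCH for length in lengths_in_inches]


train_heights_m = inches_to_metres(train_heights_in)
test_heights_m = inches_to_metres(test_heights_in)

# The refactoring changes the shape of the code, not its results.
assert train_heights_m == train_heights_m_before
assert test_heights_m == test_heights_m_before
```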
C. Refactoring process
Pre-requisites
- Local dev environment is set up
- Run notebook and ensure it works
- Make a copy of the notebook
- Remove print statements
- Read notebook and list code smells
- Convert the notebook to a Python file
- Determine the boundary of the refactoring and add a characterisation test there
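A characterisation test pins down what the existing code currently does, so a refactoring cannot silently change its behaviour. The sketch below assumes a hypothetical function `notebook_cell_logic` standing in for code copied verbatim from the notebook; the expected value is whatever the current code produces, not something derived from a specification.

```python
from typing import List, Sequence


def notebook_cell_logic(raw_names: Sequence[str]) -> List[str]:
    # Stand-in for untouched notebook code (hypothetical example).
    out = []
    for n in raw_names:
        n = n.strip().lower()
        if n != "":
            out.append(n.replace(" ", "_"))
    return out


def test_characterise_notebook_cell_logic():
    # The expected output was recorded by running the current code once.
    observed = notebook_cell_logic(["  Passenger Id ", "", "Age"])
    assert observed == ["passenger_id", "age"]
```

Once this test passes against the original code, it acts as a safety net at the refactoring boundary: any extracted function must keep it green.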
Refactoring steps
- Run tests in watch mode
- Identify a block of code that can be extracted into a pure function
- The refactoring cycle
- Write a test
- Create a Python module and define a function
- Make the test pass
- In the notebook, replace original code block with the newly defined function
- Restart and run entire notebook
- Commit your changes
- Add functional test for ML model
- Celebrate :)
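The "add functional test for ML model" step could look roughly like the sketch below. Everything in it is an assumption for illustration: the toy `train_threshold_classifier` stands in for real training code, the data is made up, and `ACCURACY_THRESHOLD` is an arbitrary acceptance bar that a real project would choose for itself.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[float, int]  # (feature value, label)

ACCURACY_THRESHOLD = 0.8  # assumed acceptance bar, not a recommended value


def train_threshold_classifier(examples: Sequence[Example]) -> Callable[[float], int]:
    """Toy stand-in for model training: cut at the midpoint of the class means."""
    positives = [x for x, label in examples if label == 1]
    negatives = [x for x, label in examples if label == 0]
    mean_positive = sum(positives) / len(positives)
    mean_negative = sum(negatives) / len(negatives)
    cutoff = (mean_positive + mean_negative) / 2
    return lambda x: 1 if x >= cutoff else 0


def accuracy(model: Callable[[float], int], examples: Sequence[Example]) -> float:
    """Fraction of examples the model labels correctly."""
    correct = sum(1 for x, label in examples if model(x) == label)
    return correct / len(examples)


def test_model_accuracy_is_above_threshold():
    # Functional test: train end to end, then check the metric on held-out data.
    train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
    held_out = [(1.5, 0), (3.0, 0), (7.0, 1), (8.5, 1), (2.5, 0)]
    model = train_threshold_classifier(train)
    assert accuracy(model, held_out) >= ACCURACY_THRESHOLD
```

Unlike a unit test, this test does not check any single function's output; it fails only when a change pushes the model's overall metric below the agreed threshold.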
Reference
- https://www.thoughtworks.com/insights/blog/coding-habits-data-scientists
- PEP 8 style guide summary - https://tandysony.com/2018/02/14/pep-8.html
To-do
- Learn unit testing
- Learn version control