DVC

DVC ensures the robustness of their popular machine-learning toolkit with DeepSource.

DVC, created by Iterative, is an open-source version control system for datasets and machine learning models. It is designed to track the complete evolution of ML models, making it easy to switch back and forth between experiments.

DeepSource is static code analysis for humans. Stop wasting your time setting up and maintaining CLI tools on CI; just use DeepSource.

Ruslan Kuprieiev, Senior Software Engineer

Challenge

Data scientists have to switch between numerous time-intensive experiments until they get the algorithm right. They use DVC to streamline this iterative process so they can make the switch instantaneously. A single critical bug in the tool can wreck the progress made on building a model, which is why shipping quality, reliable code is taken very seriously at DVC.

While DVC had been using existing static analysis tools, they were on the lookout for a better tool that could:

  • Flag elusive issues that easily go undetected, even during manual code reviews
  • Report issues accurately, based on the context of their code, so the team doesn't have to deal with irrelevant ones
  • Blend with the existing development workflow so the team doesn't lose track of issues and can easily act on them

Solution

"The GitHub integration is flawless", said Ruslan. From signing in with GitHub to installing DeepSource, the configuration is straight forward. It took him a few clicks and less than ten minutes to start reviewing the code.

Discovering 'hard-to-spot' flaws in the source code

DeepSource Analyzers detect 600+ types of issues in Python code: potential security flaws, bug risks, and anti-patterns, along with more trivial style and syntax issues. In the very first analysis after integrating DeepSource, Ruslan discovered over 200 complex issues in the codebase.

In addition, some issues are already known to reviewers and don't need to be fixed. To stop such issues from being flagged, DVC used DeepSource's 'This violation is intentional' feature to tune the analysis. It helped the team focus only on important warnings.
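
To make 'hard-to-spot' concrete, here is a minimal Python sketch of one classic bug risk that static analyzers commonly flag: a mutable default argument. The function names and data are hypothetical and are not taken from DVC's codebase, and the source does not say this specific issue was among those found. It is only meant to illustrate the category.

```python
def add_stage(stage, stages=[]):
    # Bug risk: the default list is created once, when the function is
    # defined, so every call without `stages` mutates the same object.
    stages.append(stage)
    return stages


def add_stage_fixed(stage, stages=None):
    # Common fix: use None as the sentinel and build a fresh list per call.
    if stages is None:
        stages = []
    stages.append(stage)
    return stages


# The buggy version leaks state between unrelated calls.
buggy_first = add_stage("train")
buggy_second = add_stage("evaluate")

# The fixed version returns an independent list each time.
fixed_first = add_stage_fixed("train")
fixed_second = add_stage_fixed("evaluate")

print(buggy_second)
print(fixed_second)
```

Both functions look almost identical in a diff, which is why this kind of issue tends to survive manual review and is better caught automatically.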

Quality checks within the workflow to keep track of issues easily

Issues found during reviews can slip through the cracks and end up in production if they are not logged properly, whether due to human error or a lack of proper tools. Since DeepSource flags issues directly in pull requests and commits alongside the CI/CD checks, keeping track of every detected issue and acting on it has been smooth sailing for the team.

Results

DVC has been reviewing all their PRs using DeepSource, which has helped them ship quality code to production with minimal errors.
