Thursday, June 24, 2010

Python duplicate code detection with clonedigger

The Python tool clonedigger can be used to examine your source code for duplication. It can be installed with easy_install clonedigger. Here is how to run it:
user1@deby:~/devenv/trunk$ ../bin/clonedigger src/
Parsing  src/greatings/helloworld.py ... done
Parsing  src/greatings/__init__.py ... done
Parsing  src/greatings/tests/__init__.py ... done
Parsing  src/greatings/tests/test_helloworld.py ... done
3 sequences
average sequence length: 2.666667
maximum sequence length: 3
Number of statements:  8
Calculating size for each statement... done
Building statement hash... done
Number of different hash values:  5
Building patterns... 6 patterns were discovered
Choosing pattern for each statement... done
Finding similar sequences of statements... 0  sequences were found
Refining candidates... 0 clones were found
Removing dominated clones... 0 clones were removed
Read more about clonedigger here.

Tuesday, June 22, 2010

Python code metrics with pymetrics

The Python tool pymetrics can be used to measure the complexity of your source code. You can install it with easy_install pymetrics. Here is an example (running in the virtual environment devenv; the source code is located in src):
user1@deby:~/devenv/trunk$ ../bin/pymetrics src/greatings\
/helloworld.py

=== File: src/greatings/helloworld.py ===
Module src/greatings/helloworld.py is missing a module doc string.
Detected at line 1

Basic Metrics for module src/greatings/helloworld.py
----------------------------------------------------
          1    maxBlockDepth
          4    numBlocks
        189    numCharacters
          1    numDocStrings
          1    numFcnDocStrings
          2    numFunctions
          7    numKeywords
         16    numLines
         73    numTokens

         50.00 %FunctionsHavingDocStrings

Functions DocString present(+) or missing(-)
--------------------------------------------
- main
+ say

McCabe Complexity Metric for file src/greatings/helloworld.py
--------------------------------------------------------------
          2    __main__
          1    main
          1    say

COCOMO 2's SLOC Metric for src/greatings/helloworld.py
-------------------------------------------------------
          8    src/greatings/helloworld.py

*** Processed 1 module in run ***
Here is a script that lets you generate a report for all the files in your src directory:
#!/bin/sh

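# Keep the file list in a per-user scratch directory.
working_dir="/tmp/$USER/pymetrics"
mkdir -p "$working_dir"

# Collect every Python source file, then feed the list to pymetrics.
find src/ -name '*.py' > "$working_dir/files.txt"
../bin/pymetrics --nosql --nocsv -f "$working_dir/files.txt"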
Here is a guideline for interpreting the cyclomatic complexity number:
  • 1 - 15: simple code, minimum risk, can be easily covered by tests
  • 15 - 30: complicated code, consider refactoring before writing tests
  • 30 - 50+: complex code, refactor now, almost impossible to write good tests
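To make the number itself less abstract, here is a rough sketch of how a McCabe-style count can be approximated: start at 1 and add one for every decision point. This is an illustration under my own assumptions about which constructs count as decision points, not the way pymetrics computes its metric; approximate_complexity is a hypothetical name, and the code assumes Python 3.

```python
import ast

# Node types that open an extra path through the code.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in `source`."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two decision points.
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # One for the implicit loop, one per `if` filter.
            complexity += 1 + len(node.ifs)
    return complexity
```

Straight-line code scores 1, a single `if` scores 2, and every additional branch or boolean operand pushes the number up by one.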
Read more about software complexity here and here.

How to count source lines of code in Linux

You can use the Debian package sloccount to count source lines of code (SLOC):
deby:~# apt-get install sloccount
  ...
user1@deby:~/devenv/trunk$ sloccount src/ | grep ^python
python:          31 (100.00%)
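If you are wondering what a "source line" roughly is, here is a minimal sketch that counts lines that are neither blank nor comment-only. sloccount applies its own language-specific rules, which this sketch does not try to reproduce, so the two numbers may differ; count_sloc is a name invented for this example.

```python
def count_sloc(source):
    """Count lines that are neither blank nor comment-only."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        # A line counts if something other than a `#` comment is on it.
        if stripped and not stripped.startswith("#"):
            count += 1
    return count
```

A line with code followed by a trailing comment still counts as one source line.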
Read more about sloccount here.

Monday, June 21, 2010

Python static code analysis with pyflakes

The Python tool pyflakes focuses on identifying common errors quickly without executing Python code. It can be installed with easy_install pyflakes. Since the tool doesn't import your code, you just specify a folder with source code:
user1@deby:~/devenv/trunk$ ../bin/pyflakes src/
Using pyflakes with an IDE like PyDev makes good sense, since you see errors while you are typing.
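To illustrate what checking code "without executing it" can look like, here is a small sketch that parses a module and reports imported names that are never referenced. It only shows the general idea of static analysis and is not how pyflakes is implemented; unused_imports is a hypothetical helper, and the code assumes Python 3.

```python
import ast


def unused_imports(source):
    """Return sorted names that are imported but never referenced."""
    tree = ast.parse(source)  # parses only; the code is never run
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports bind unknown names; skip them
                # `import a.b` binds the name `a`; `import a as b` binds `b`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)
```

Because the source is only parsed, a module that would fail or exit on import can still be checked.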

Python static code analysis with pylint

Pylint is a Python tool that checks whether a module satisfies a coding standard. You can install pylint with easy_install (pylint depends on logilab_common and logilab_astng, so consider downloading them as well for an offline install).
easy_install pylint
Running checks is pretty easy:
user1@deby:~/devenv/trunk/src$ ../../bin/pylint greatings
No config file found, using default configuration
************* Module greatings
C:  1: Missing docstring
************* Module greatings.helloworld
C:  1: Missing docstring
C: 10:main: Missing docstring
************* Module greatings.tests
C:  1: Missing docstring
W:  4: Relative import 'test_helloworld', should be 'greatings.tests.test_helloworld'
W:  7:suite: Redefining name 'suite' from outer scope (line 6)
C:  6:suite: Missing docstring
************* Module greatings.tests.test_helloworld
C:  1: Missing docstring
C:  4:HelloworldTestCase: Missing docstring
C:  6:HelloworldTestCase.test_say: Missing docstring
R:  6:HelloworldTestCase.test_say: Method could be a function
W: 12:suite: Redefining name 'suite' from outer scope (line 10)
C: 10:suite: Missing docstring
  ...
Global evaluation
-----------------
Your code has been rated at 5.94/10
The tool also produces a report that includes:
  • Statistics by type
  • External dependencies
  • Duplication
  • Raw metrics
  • Messages by category
  • % errors / warnings by module
  • Messages
If you want to change the default behaviour, you can define options in a pylintrc file (use the --rcfile option to specify its location). Here is an example.
user1@deby:~/devenv/trunk$ wget -P tools/ http://www.\
logilab.org/cgi-bin/hgwebdir.cgi/pylint/raw-file\
/df8f34aa3dd2/examples/pylintrc
user1@deby:~/devenv/trunk$ cd src/
user1@deby:~/devenv/trunk/src$ ../../bin/pylint \
--rcfile=../tools/pylintrc greatings
Read more about pylint here.

Python static code analysis with pychecker

The Python tool PyChecker is used to find typical programming errors in your source code. To install it, you need to download it and install it manually (unfortunately, installing it with easy_install does not work). We will assume the following directory structure (note that devenv is a virtual environment; you can create it by issuing virtualenv devenv).
~/devenv/
`-- trunk/
    |-- src/
    |   `-- greatings/
    |       |-- __init__.py
    |       |-- helloworld.py
    |       `-- tests/
    |           |-- __init__.py
    |           `-- test_helloworld.py
    `-- tools/
Let's download pychecker into the tools directory and proceed with the installation:
user1@deby:~/devenv/trunk$ wget -P tools/ http://downloads.\
sourceforge.net/project/pychecker/pychecker/0.8.18/\
pychecker-0.8.18.tar.gz
...
user1@deby:~/devenv/trunk$ cd tools && tar zxf \
pychecker-0.8.18.tar.gz && cd pychecker-0.8.18
user1@deby:~/devenv/trunk/tools/pychecker-0.8.18$ \
../../../bin/python setup.py install && cd ../.. && \
rm -rf tools/pychecker-0.8.18
...
The next step is to "fix" the pychecker file installed into the bin directory of your virtual environment so that its content looks like this:
#!/bin/sh

# Locate site-packages relative to this script, then run the checker.
site_packages_dir="$(dirname "$0")/../lib/python2.6/site-packages"
python "$site_packages_dir/pychecker/checker.py" "$@"
Since pychecker imports all the files it is going to check, you need to start the tool in the correct directory, in our case src:
user1@deby:~/devenv/trunk$ cd src/
user1@deby:~/devenv/trunk/src$ ../../bin/pychecker greatings/*.py
Processing module helloworld (greatings/helloworld.py)...
Processing module __init__ (greatings/__init__.py)...

Warnings...

None
Please note that pychecker doesn't descend into sub-packages, so to analyze the tests, issue the following command:
../../bin/pychecker greatings/tests/*.py
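With more than a couple of sub-packages, typing one command per directory gets tedious. Here is a small sketch that groups the .py files by directory, so that each group can be handed to a separate pychecker run. The helper name python_files_by_directory is made up for this example, and how you invoke pychecker on each group (for instance via subprocess) is left as an assumption.

```python
import os


def python_files_by_directory(root):
    """Map each directory under `root` to the .py files directly inside it."""
    groups = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk sub-directories in a stable order
        sources = sorted(
            os.path.join(dirpath, name)
            for name in filenames
            if name.endswith(".py")
        )
        if sources:
            groups[dirpath] = sources
    return groups
```

Called as python_files_by_directory("greatings") from the src directory, this would yield one entry for greatings and one for greatings/tests, matching the two commands shown above.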
If you want to change the default behaviour, you can define options in a pycheckrc file (use the -F option to specify its location). Here is an example.
user1@deby:~/devenv/trunk$ wget -P tools/ http://pychecker.\
cvs.sourceforge.net/viewvc/pychecker/pychecker/pycheckrc
user1@deby:~/devenv/trunk$ cd src/
user1@deby:~/devenv/trunk/src$ ../../bin/pychecker -F \
../tools/pycheckrc greatings/*.py
Read more about pychecker here and here.