http://bjoernstechblog.rueffer.info/posts/latex/bash/2010/05/11/Deleting-unused-equation-numbers-from-LaTeX-code/
last updated on 25 May 2018

11 May 2010

Deleting unused equation numbers from LaTeX code

Suppose you’ve written a long LaTeX document containing many numbered formulas, for example a scientific paper. If you have used AUCTeX’s fantastic macro completion, it may well be that many of the numbered equations you entered are never referenced. How do you get rid of the unused equation numbers? Unfortunately, Emacs itself (i.e., RefTeX) does not seem to offer a remedy. So I came up with a little bash script that finds all labels and all references, matches them up to see which labels are never used, and then deletes those from your TeX file (yes, it deletes all unused labels, not only unused equation labels). Moreover, it inserts \nonumber commands into equations, so not only the label definitions disappear but also the equation numbers.

Here’s the code. The script takes a TeX file as its only command line argument and writes the modified TeX source to standard output.

#!/bin/bash 

# Copyright (C) 2010 by Bjoern Rueffer, Time-stamp: <2010-05-11 16:19:49 bjoern>

if [ $# -ne 1 ]; then 
	cat <<EOF  
Usage: $0 filename.tex

This will output a cleaned version of filename.tex on the standard
output. Every \label{...} definition with an unused label is deleted.

Note: Lines with references to labels should NOT contain the symbol #. 
Otherwise bad things will happen...

EOF
	exit 1
fi 

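# do a sanity check: the reference extraction below uses '#' as a marker,
# so refuse input lines that contain both a reference and a '#'
if grep '#' "$1" | grep 'ref{.*}' >&2; then
	echo "Error: An input line containing a reference also contains the" >&2
	echo "character '#'. Currently this is not supported by this script." >&2
	exit 1
fi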

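# isolate list of all defined labels (grep -o prints each \label{...}
# on its own line, so several labels per line are handled as well)
grep -o '\\label{[^}]*}' "$1" | sed -e 's/^\\label{\([^}]*\)}$/\1/' \
 | sort -u > /tmp/labels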

# find all references to labels, sorry this is a bit ugly
grep 'ref{.*}' "$1" | sed -e '
s/\(.*ref{[^}]*}\).*/\1/
s/ref{[^}]*}/FIRSTMARKER&/
s/^.*FIRSTMARKER//
s/ref{\([^}]*\)}/#\1#/g
s/#//
s/#[^#]*#/ /g
s/#//
' | xargs -n 1 echo | sort -u > /tmp/references

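# identify the labels that have not been referenced
# (-x and -F compare whole lines literally, so a reference to eq:1
# does not make an unused label eq:10 look used)
grep -vxFf /tmp/references /tmp/labels > /tmp/unusedlabels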

# now delete labels from tex file and insert \nonumber commands
# into equations where necessary
rm -f /tmp/scriptfile
# (single quotes keep the doubled backslashes, so that sed matches
# and emits a literal backslash)
xargs -I % echo '/%/s/\\label{%}/\\nolabelhere /g' \
	< /tmp/unusedlabels >> /tmp/scriptfile
printf '%s\n' '/\\begin{document}/i\' >> /tmp/scriptfile
printf '%s\n' '\\def\\nolabelhere{\\leavevmode\\ifmmode\\nonumber\\else\\fi}%' \
	>> /tmp/scriptfile
echo >> /tmp/scriptfile
sed -f /tmp/scriptfile "$1"

# clean up
rm -f /tmp/scriptfile /tmp/references /tmp/unusedlabels /tmp/labels
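
For comparison, here is a minimal sketch of a shorter way to do the matching step (finding labels, finding references, and listing the labels that are never referenced). It assumes a grep that supports the -o option (GNU and BSD grep both do) and the mktemp utility. The function names list_labels, list_refs and unused_labels are made up for this illustration and are not part of the script above. Because grep -o prints each match on its own line, this variant should not need the '#' marker trick, but it only lists the unused labels and does not rewrite the TeX file.

```shell
# Sketch only: the helper names below are hypothetical.

# Print every label defined in the TeX file, one per line, sorted.
list_labels() {
	grep -o '\\label{[^}]*}' "$1" | sed -e 's/^\\label{//' -e 's/}$//' | sort -u
}

# Print every label used by a command ending in "ref"
# (\ref, \eqref, \pageref, ...), one per line, sorted.
list_refs() {
	grep -o 'ref{[^}]*}' "$1" | sed -e 's/^ref{//' -e 's/}$//' | sort -u
}

# Print the labels that are defined but never referenced.
unused_labels() {
	ul_refs=$(mktemp) || return 1
	list_refs "$1" > "$ul_refs"
	# -x and -F: compare whole lines as fixed strings, not as regex substrings
	list_labels "$1" | grep -vxF -f "$ul_refs"
	rm -f "$ul_refs"
}
```

After sourcing these definitions, calling unused_labels paper.tex (with paper.tex standing in for your own file) prints one unused label per line.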

Comments (through old commenting system)

Kalidoss, 24 June 2010:  With your script, everything works for me except deleting labels from the TeX file and inserting \nonumber commands into equations where necessary. Where have I gone wrong? I am interested in finding and deleting \label{...} in $$..$$, \[ .. . \] and {eqnarray*} environments. How can I achieve this behaviour?

Björn, 30 November 2010:  What the script does is essentially a selective search and replace. It is not keeping track of the type of environment where a label is defined or if that label even occurs in math-mode at all. What you seem to be asking is surely possible as well, but a different approach would be needed. I would use e.g. perl to go through the source and pick out the environments you are interested in and then delete the label definition. Regular expressions are your friend ;)

Edwin, 20 December 2010:  How can I find out which equation labels occur in a listed environment in a TeX file? Is there a script for this?

Björn, 22 December 2010:  Not at the moment. But one way of writing such a script is to first find all labels (e.g. by using parts of the script above) and then checking for each of them if they appear in the type of environment that you are interested in.

Björn Rüffer — Copyright © 2009–2018 — bjoern.rueffer.info