On any long-running project of consequence, refactoring will happen. While this is usually a good thing, it can sometimes lead to the case where all references to a file are removed, but the file itself still sits in the codebase. Though not technically a problem in itself, this can cause a bit of unnecessary bloat and confusion, not to mention unnecessary time spent updating these orphaned files to support future refactoring efforts. This is the spot that I found myself in while working with a Flex application recently. Unfortunately, Flex doesn’t natively provide a way to determine which files are required and which are not. Thankfully, with just a little bit of work, we can prevent further cognitive and code clutter by finding and removing these unreferenced files.

###Step 1 - Collecting the Linker Data###

The first thing that we need to do is make the Flex compiler output some information that we can parse. If compilation is performed by the <mxmlc> Ant task, you can just modify the call to include the link-report attribute, e.g. <mxmlc … link-report="my_report.xml">. Otherwise, you can append -link-report my_report.xml to the compiler arguments.

Once you do this, every compile will write out an XML file that contains a few bits of useful information:

<script> - A file which was included into the binary by the compiler.
<def>    - A class provided within a given file.
<pre>    - A prerequisite for the defined class, such as a parent class or interface.
<dep>    - Any other class that our new class has a dependency on but is not based on.

For example, this is one possible snippet that the compiler could add to our link report:

<script name="C:\Users\CaptainJack\Source\DemoProject\src\com\example\demo\Frobnitz.as"
        mod="1338614991879"
        size="722"
        optimizedsize="325">
  <def id="com.example.demo:Frobnitz" />
  <pre id="Object" />
  <pre id="com.example.demo:Unobtainium" />
  <dep id="com.example.demo:Bar" />
</script>
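As a rough illustration of how these tags map onto data, the sketch below parses the example entry and pulls out the file name, the defined class, and the classes it relies on. It uses REXML, which ships with Ruby, instead of Nokogiri so that it runs without extra gems. The names SAMPLE_ENTRY, ids_for, and summarize are made up for this example.

```ruby
require 'rexml/document'

# The example <script> entry shown above, kept verbatim. The quoted
# heredoc tag ('XML') stops Ruby from interpreting the backslashes.
SAMPLE_ENTRY = <<~'XML'
  <script name="C:\Users\CaptainJack\Source\DemoProject\src\com\example\demo\Frobnitz.as"
          mod="1338614991879"
          size="722"
          optimizedsize="325">
    <def id="com.example.demo:Frobnitz" />
    <pre id="Object" />
    <pre id="com.example.demo:Unobtainium" />
    <dep id="com.example.demo:Bar" />
  </script>
XML

# Return the id attribute of every element with the given tag name.
def ids_for(doc, tag)
  REXML::XPath.match(doc, "//#{tag}").map { |el| el.attributes['id'] }
end

# Split one report (or report fragment) into the three lists we care about.
def summarize(xml)
  doc = REXML::Document.new(xml)
  {
    files:    REXML::XPath.match(doc, '//script').map { |el| el.attributes['name'] },
    defined:  ids_for(doc, 'def'),
    required: ids_for(doc, 'pre') + ids_for(doc, 'dep')
  }
end

summary = summarize(SAMPLE_ENTRY)
puts "defined:  #{summary[:defined].join(', ')}"
puts "required: #{summary[:required].join(', ')}"
```

Here Frobnitz is the only class defined, while Object, Unobtainium, and Bar are all classes that it requires.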

After generating the link reports for all of the projects that you normally build, place the XML files together in a new directory so that we can easily parse them. For the following sections, I assume that you have Ruby and the Nokogiri gem installed on your system.
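If every project writes its report under the same file name, copying them into one directory by hand invites overwrites. Below is a minimal sketch of one way to gather them, assuming each project directory contains a report called my_report.xml. The helper name collect_link_reports and the naming scheme are hypothetical, not part of any Flex tooling.

```ruby
require 'fileutils'

# Copy each project's link report into dest_dir, naming the copy after the
# project directory so identically named reports do not overwrite each
# other. report_name is an assumed file name; use whatever value was
# passed to link-report in the build.
def collect_link_reports(project_dirs, dest_dir, report_name = 'my_report.xml')
  FileUtils.mkdir_p(dest_dir)
  project_dirs.each_with_object([]) do |project_dir, copied|
    source = File.join(project_dir, report_name)
    next unless File.file?(source) # no report here yet, so skip the project

    target = File.join(dest_dir, "#{File.basename(project_dir)}.xml")
    FileUtils.cp(source, target)
    copied << target
  end
end
```

Calling collect_link_reports(['DemoProject', 'DemoLibrary'], 'link_reports') would, under these assumptions, leave DemoProject.xml and DemoLibrary.xml in link_reports and return the paths of the copies.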

###Step 2 - Finding Unreferenced Classes###

Since other projects can compile against the contents of a library, there is no compile-time way for compc[1] to tell that a class has never been used. Because of this, the list of files to compile into a library has to be specified manually rather than determined programmatically. However, once our last compile has finished, we can use the information in our link reports to derive a list of unneeded classes.

Logically, we can assume that an unused class is one that is <def>ined but never listed as a <dep>endency or <pre>requisite of any other class. Armed with this information, we can find all of the classes in our libraries that are never referenced. First, we parse all of the XML files for the three aforementioned tags, putting the <def> ids in one array and the <dep> and <pre> ids in another. Then we simply subtract the second array (required classes) from the first (defined classes). Whatever is left is defined but not required.

require 'nokogiri'

required_classes = []
provided_classes = []

# Gather the class ids from every link report in the given directory.
Dir.foreach(ARGV[0]) do |filename|
  next if ['.', '..'].include? filename

  doc = Nokogiri::XML(File.read(File.join(ARGV[0], filename)))
  doc.search('def').each { |tag| provided_classes << tag['id'] }
  doc.search('pre').each { |tag| required_classes << tag['id'] }
  doc.search('dep').each { |tag| required_classes << tag['id'] }
end

# Anything that is defined but never required is unreferenced.
unneeded_classes = (provided_classes - required_classes).uniq.sort

puts unneeded_classes

Run with ruby application.rb path/to/xml
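To see why the reports from every project need to be parsed together, here is a small worked example of the same subtraction on hand-written data. The two hashes stand in for a library's report and an application's report. Their contents, and the class com.example.demo:Orphan, are invented for illustration.

```ruby
# Ids as they might be gathered from a library's link report (toy data).
LIBRARY_REPORT = {
  provided: ['com.example.demo:Bar', 'com.example.demo:Orphan'],
  required: ['Object']
}.freeze

# Ids as they might be gathered from an application's link report (toy data).
APP_REPORT = {
  provided: ['com.example.demo:Frobnitz'],
  required: ['Object', 'com.example.demo:Bar']
}.freeze

# Pool the ids from all reports, then subtract required from provided.
def unreferenced(reports)
  provided = reports.flat_map { |report| report[:provided] }
  required = reports.flat_map { |report| report[:required] }
  (provided - required).uniq.sort
end

puts 'Library report alone:'
puts unreferenced([LIBRARY_REPORT])

puts 'Both reports together:'
puts unreferenced([LIBRARY_REPORT, APP_REPORT])
```

With only the library report, Bar looks unreferenced. Once the application's report is added, Bar drops off the list because the application depends on it. In this toy data Frobnitz then appears, since nothing lists it as a dependency, so classes that nothing else refers to will need a manual check before anything is deleted.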

###Step 3 - Culling Files From SWFs###

Since nothing can compile against a SWF, the mxmlc compiler is able to use simple dependency checks to determine which files need to be compiled. Though the working code is slightly longer, the logic in this script is actually more straightforward: if the compiler doesn’t compile it, it isn’t referenced. With that in mind, we simply get a list of all of the .as and .mxml files in the project directories and list the ones that are not referenced in a <script> tag.[2]

require 'nokogiri'

available_files = []
required_files = []
working_dir = "#{ARGV[0]}/"

# Record every file named in a <script> tag, using forward slashes and
# dropping the report directory prefix wherever it is present.
Dir.foreach(working_dir) do |filename|
  next if ['.', '..'].include? filename

  doc = Nokogiri::XML(File.read("#{working_dir}#{filename}"))
  doc.search('script').each do |tag|
    required_files << tag['name'].gsub('\\', '/').gsub(working_dir, '')
  end
end

# Derive each source root: the path up to the last directory named src.
src_directories = required_files
  .select { |file| file.match?(%r{/src/}) }
  .map { |file| file.sub(%r{(.*/src)/.*}, '\1') }
  .uniq.sort

# List every source file on disk, then print those that no <script> tag mentions.
src_directories.each { |dir| available_files += Dir.glob("#{dir}/**/*.{as,mxml}") }
puts((available_files - required_files).uniq.sort.map { |file| file.gsub('/', '\\') })

Run with ruby application.rb path/to/xml
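The step that is easiest to get wrong is turning the script names into source roots, so here is a minimal sketch of that step alone. The helper names normalize and src_root are hypothetical, and the .swc entry in the sample data is an assumption about how library classes may appear in a report. The pattern only accepts a directory named exactly src, which is one possible reading of "source root".

```ruby
# Convert a Windows-style path from a link report to forward slashes.
def normalize(path)
  path.gsub('\\', '/')
end

# Return everything up to the last directory named exactly "src",
# or nil when the path contains no such directory.
def src_root(path)
  normalize(path)[%r{\A(.*/src)/}, 1]
end

# Sample script names. The first two follow the example entry from Step 1;
# the .swc entry is an assumed form for a class pulled from a library.
SCRIPT_NAMES = [
  'C:\Users\CaptainJack\Source\DemoProject\src\com\example\demo\Frobnitz.as',
  'C:\Users\CaptainJack\Source\DemoProject\src\com\example\demo\Bar.as',
  'C:\flex_sdk\frameworks\libs\framework.swc(mx/core/UIComponent)'
].freeze

roots = SCRIPT_NAMES.map { |name| src_root(name) }.compact.uniq
puts roots
```

Both project files resolve to the same root, and the library entry is skipped because it has no src directory to search.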

###Results###

After running these two scripts against the Flex project I had been examining, I was able to identify roughly 3% of the files as being completely unreferenced.

  1. If Adobe follows their usual modus operandi, the supplied link will just quit working one day. It seems that nobody in Adobe’s employ is familiar with the term permalink.

  2. Technically, our logic will output a list of all files that do not define a class or interface as well. Thus, any files that are included or used as a mix-in, but that do not declare a class or interface will be incorrectly listed as unused. As far as I know, the information to make a perfectly accurate list is not provided through the linker report.