#!/usr/bin/perl

=head1 NAME

graphcompare - A command-line tool to compare graph files in DOT or tabular format.

=head1 VERSION

v0.6.2

=head1 SYNOPSIS

    graphcompare  -input file1.dot,file2.dot \
                  -stats                     \
                  -colors HARD               \
                  -output output.dot         \
                  -table table.tbl           \
                  -venn venn.svg             \
                  -par parallelplot          \
                  -web graph.html

=head1 DESCRIPTION

This application compares two or more graph files (DOT or tabular). It prints a merged graph
with different colors for nodes and edges depending on the files in which they appear.
To read the files, graphcompare uses the module Dot::Parser or the module Tabgraph::Reader,
both located in lib/.

The main functionality of the script can be found at lib/Graphs/Compare.pm. This distribution
comes with a command-line tool (graphcompare) to compare the files.

By default, graphcompare will print the resulting graph to
STDOUT, but you can change it with the option --output (see options below).

graphcompare has some optional outputs, each one specified by one
option.

=over 8

=item B<Venn diagram>

If given the option -v, graphcompare will create an
svg file containing a venn diagram. In this image, you will be able to see
a comparison of the counts of nodes and relationships in each input DOT file,
and those nodes/relationships common to more than one file. The colors will be
chosen using one of the profiles in data/colors.txt. By default, the color palette
is set to be "SOFT". To change it, use the option -c (see options below).

=item B<Table>.

Complementary to the venn diagram, one can choose to create a
table containing all the counts (so it can be used to create other plots or tables). The
table is already formated to be used by R. Load it to a dataframe using:

        df <-read.table(file="yourtable.tbl", header=TRUE)

=item B<Webpage with the graph>.

With the option -w, one can create a webpage
with a representation of the merged graph (with different colors for nodes and
relationships depending on their presence in each DOT file). To make this representation,
graphcompare uses the Open Source library cytoscape.js. All the cytoscape.js code is
embedded in the html file to allow maximum portability: the webpage and the graph work
without any external file/script dependencies. This allows for an easy upload of the graph
to any website.

=back

=head1 INSTALLATION

    perl Makefile.PL
    make
    make test
    make install


It is important to note that if you decide to install graphcompare manually, the script needs to use File::Share to find
the templates. If you choose to not use the Makefile.PL installer, you may encounter some bugs, as graphcompare will be unable to open
the templates.

=head1 DIRECTORIES

These are the directories and the files inside the distribution:

=over 8

=item B<bin/>

This directory contains the main script: graphcompare.

=item B<lib/>

Here we can find the modules used by the program: Graphs::Compare, Dot::Parser, Dot::Writer,
Tabgraph::Reader, Tabgraph::Writer. The main functionality of the application is implemented in
Graphs::Compare. Dot::Parser is a Perl module that reads graphviz files. To see how it works, refer to its documentation:

    perldoc lib/Dot/Parser.pm

=item B<share/>

Here we can find the templates graphcompare uses to create the svg venn diagrams and the html output. We can also find
some test files, test1.dot, test2.dot and test3.dot to try out the program.

=item B<t/>

This is the directory with the test files and the script that runs the tests (parser.t).

=item B<Makefile.PL>

This is the script that uses ExtUtils::MakeMaker to create a Makefile to install the distribution.

=item B<MANIFEST>

List of all the files of the distribution.

=back



=head1 OPTIONS

=over 8

=item B<-h>, B<-help>

Shows this help.

=item B<-in>, B<-input> <file1,file2,...>

REQUIRED. Input files, separated by commas. Only DOT (graphviz) or TBL files.

=item B<-out>, B<-output> <filename.dot>

Saves the merged dot file to the specified file. Default to STDOUT.

=item B<-fmtin> FORMAT

Forces the program to read ALL the files as 'DOT' or 'TBL'. By default, graphcompare
looks at the extension of each file to choose one parser or another.

=item B<-fmtout> FORMAT

Changes the format of the output graph. By default it will use the DOT language.
As of now, you can change it to TBL.

=item B<-c>, B<-colors> <profile>

Color profile to use: SOFT (default), HARD, LARGE or CBLIND.

=item B<-ig>, B<-ignore-case>

Makes dotocompare case insensitive. By default, graphcompare is case sensitive.

=item B<-s>, B<-stats>

Prints to STDERR some graph properties for each DOT file. It can be time consuming if the
input graphs are very big.

=item B<-v>, B<-venn> <filename.svg>

Creates a venn diagram with the results.

=item B<-p>, B<-parallel> <basename>

Creates three parallel plots comparing the in/out/total degree of the graphs using the basename as filename.

=item B<-w>, B<-web> <filename.html>

Writes html file with the graph using cytoscape.js

=item B<-n>, B<-node-list> <filename.tbl>

Creates a file with a list of all the nodes and their appeareance on the input files. Each column
will represent one of the files, and they will contain a one if the node appears in it, or a zero
if it does not.

=back

=head1 BUGS AND PROBLEMS

=head2 Current Limitations

=over 8

=item I<Undirected_graphs>

Only works with directed graphs. If undirected,
graphcompare considers them to be directed.

=item I<Clusters>

Still no clusters support e.g. {A B C} -> D

=item I<Multiline_IDs>

No support for multiline IDs.

=item I<No_escaped_quotes>

No support for quotes in node IDs (even if properly escaped).


=item I<Compass_ports>

No support for compass ports.


=back


=head1 DEPENDENCIES

=over 8

=item File::ShareDir::Install

=item File::Share

=item Test::More

=item Pod::Usage

=item Cwd

=item Graph::Directed (only if using option -s)

=item AutoLoader (if comparing more than 5 files)

=item Color::Spectrum::Multi (if comparing more than 5 files)

=item Statistics::R (if option -p)

=item R modules: ggplot2, GGally (if option -p)


=back


=head1 AUTHOR

Sergio Castillo Lara - s.cast.lara@gmail.com


=head2 Reporting Bugs

Report Bugs at I<https://github.com/scastlara/graphcompare/issues> (still private)

=head1 COPYRIGHT


    graphcompare  command-line tool to compare graph files.
    Copyright (C) 2015  Sergio CASTILLO LARA

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

=cut


#===============================================================================
# MODULES
#===============================================================================

use warnings;
use strict;
use Pod::Usage;
use Cwd 'abs_path';
use Graphs::Compare qw(compare_dots);
use Getopt::Long qw(:config no_ignore_case);


#===============================================================================
# VARIABLES AND OPTIONS
#===============================================================================
our $PROGRAM       = "graphcompare";
our $VERSION       = 'v0.6.1';
our $USER          = $ENV{ USER };
our $W_DIRECTORY   = $ENV{ PWD };
our $INSTALL_PATH  = get_installpath();


# If no arguments provided
pod2usage( -verbose => 0,
           -output  => \*STDOUT   ) unless @ARGV;

my %options = ();

GetOptions (
    \%options    ,
    'help|?'     ,
    "input=s"    ,
    "fmtin=s"    ,
    "colors=s"   ,
    "output=s"   ,
    "fmtout=s"   ,
    "table=s"    ,
    "venn=s"     ,
    "web=s"      ,
    "stats"      ,
    "node-list=s",
    "parallel=s" ,
    "ignore-case",
    "Test"       ,
    "debug"
 );


# If option --help
pod2usage( -verbose => 1,
           -output  => \*STDOUT   ) if $options{help};


my @files = split /,/, $options{input} if defined $options{input};

# If no files or too many files
if (@files == 0) {
    error("You have to introduce at least 1 DOT file \n\n\t" .
          'perl graphcompare -input file1,file2,file3...', 1
         );
} elsif (@files == 5 and $options{colors} ne "LARGE") {
    error("Too many files to use color palette $options{colors}\n".
          "Changing color palette to LARGE\n");
    $options{colors} = "LARGE";
} elsif (@files > 5) {
    error("Too many files to use the default color palettes.\n".
          "Will generate one using Color::Spectrum\n");
}

# DEFAULT OUTPUT FORMAT
if (not defined $options{fmtout}) {
    $options{fmtout} = "DOT";
} elsif ($options{fmtout} !~ m/^(DOT|TBL)$/i) {
    error("graphcompare can only write DOT or TBL files\n", 1);
}

# DEFAULT COLOR PALETTE
if (not defined $options{colors}) {
    $options{colors} = "SOFT";
}


#===============================================================================
# MAIN
#===============================================================================

# START REPORT
my $start_time   = time();
my $current_time = localtime();
print STDERR "\nPROGRAM STARTED\n",
             "\tProgram         $PROGRAM\n",
             "\tVersion         $VERSION\n",
             "\tUser            $USER\n",
             "\tWorking dir     $W_DIRECTORY\n",
             "\tColor Profile   $options{colors}\n",
             "\tInput files     ", join("\n\t\t\t", @files), "\n\n",
             "\tStart time      $current_time\n\n";
#--

# CHECK IF VENN DIAGRAM
if (@files > 3 and defined $options{venn}) {
    error("Too many files to draw a venn diagram.");
    $options{venn} = undef;
} elsif (@files == 1 and defined $options{venn}) {
    error("Only one file. Won't draw any venn diagram");
    $options{venn} = undef;
}

# RUN THE MODULE
compare_dots(\@files, \%options);


# END REPORT
my $end_time  = time();
$current_time = localtime();
my $sec = $end_time - $start_time;

my $hours = ($sec/3600) % 24;
my $minutes = ($sec/60) % 60;
my $seconds = $sec % 60;

my @out_files = grep {$_} ($options{output}, $options{table}, $options{venn}, $options{web}, $options{"node-list"});

if ($options{parallel}) {
    push @out_files, $options{parallel} . ".indegree.png", $options{parallel} . ".outdegree.png", $options{parallel} . ".totaldegree.png";
}

print STDERR "\nPROGRAM FINISHED\n",
             "\tOutput files \t", join("\n\t\t\t", @out_files), "\n\n",
             "\tEnd time \t$current_time\n\n",
             "\tJob took ~ $hours hours, $minutes minutes and $seconds seconds\n\n";
#--


#===============================================================================
# FUNCTIONS
#===============================================================================
#--------------------------------------------------------------------------------
sub get_installpath {
    my $path = abs_path($0);
    $path =~ s/(.+)\/.*?$/$1\//;
    return($path);
}


#--------------------------------------------------------------------------------
sub error {
    my $string = shift;
    my $fatal  = shift;
    my @lines = split /\n/, $string;

    my $error_msg = $fatal ? "FATAL" : "MINOR";
    print STDERR "\n# [$error_msg ERROR]\n";
    print STDERR "# $_\n" foreach (@lines);

    if ($fatal) {
        print STDERR "\n\n# Use graphcompare -h to get help.\n\n";
        exit(1);
    } else {
        print STDERR "\n\n";
    }

}
