The DejaVU Framework -- hush 3.1

Manual: CIA 1 "May 1995" "CIA User's Manual"



NAME

cia - C program database builder

SYNOPSIS


cia [ option ] ... file ...



DESCRIPTION


Cia, the C Information Abstractor, extracts structure information from C programs and stores it in a relational database. The database can be accessed by cql(1), Daytona(1), and many relational database systems. Building a C program database using cia is similar to building an executable using cc(1). The following example builds a database from three C source files:

          $ cia -c -DMAP f1.c
          $ cia -c -I/usr/local/include f2.c
          $ cia -c f3.c
          $ cia f1.A f2.A f3.A
  
The preprocessing options -D and -I accepted by cc are also recognized by cia, which performs C preprocessing using libpp(1). Instead of creating .o files, cia creates an abstract database file that ends with .A for each .c file. These files are then linked to form a program database. Make(1) or nmake(1) is recommended for making sure that a program database is up to date after some source files are updated. With nmake, simply invoke "nmake ciadb" to maintain a program database.

Cia probes the first C compiler cc on the user's PATH and generates a probe(1) file if it does not exist yet. The probe file stores information about the C compiler, including predefined macros, reserved words, and other dialect information. If a C program to be analyzed corresponds to a different C compiler, then the -D-X option or -I-D option (see below) must be used explicitly to specify the compiler name (e.g., -D-Xgcc) or the corresponding probe file (e.g., -I-Dgcc.probefile).
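
The incremental workflow described above (abstract each changed .c file with -c, then link the .A files) can be sketched in Python as follows. This is only an illustration of the kind of up-to-date check that make(1) or nmake(1) would perform. The function names and the mtimes dictionary are hypothetical, header dependencies and preprocessing options are ignored, and the sketch only builds command lines without running cia.

```python
from pathlib import PurePosixPath


def abstract_name(source):
    """Map f1.c to f1.A, much as cc maps f1.c to f1.o."""
    return str(PurePosixPath(source).with_suffix(".A"))


def plan_commands(sources, mtimes):
    """Return the cia command lines needed to refresh the database.

    mtimes maps a file name to its modification time; a file that is
    absent from the mapping is treated as not existing yet.
    """
    commands = []
    for src in sources:
        target = abstract_name(src)
        # Re-abstract when the .A file is missing or older than its source.
        if mtimes.get(target, float("-inf")) < mtimes[src]:
            commands.append(["cia", "-c", src])
    # Relink only when at least one .A file was regenerated.
    if commands:
        commands.append(["cia"] + [abstract_name(s) for s in sources])
    return commands
```

With f1.A missing and f2.A newer than f2.c, the plan contains one "cia -c f1.c" step followed by a single link step over f1.A and f2.A.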

Cia extracts information from source files according to a conceptual model (see below). This model defines the entities, attributes, and relationships to be collected from C programs.

Entities and Attributes

Cia records information on the definitions, declarations, and usage of entities that can be referenced across function boundaries. These non-local entities include files, functions, variables, data types (i.e., structs, unions, typedefs, and enums), and macros (i.e., #define's and #undef's). These entities are referred to as CIA entities in CIA documents. The following table shows the kinds of entities and their corresponding attributes defined in the CIA conceptual model. A '*' character indicates that the attribute exists for the entity kind.

                     attr.     val       func var macro type file
  ---------------------------------------------------------------
  Index              id        string    * * * * *
  Name               name      string    * * * * *
  Kind               kind      enum      * * * * *
  Filename           file      string    * * * * *
  Data type          dtype     string    * * *
  Storage class      sclass    enum      * * * *
  Beginning line     bline     int       * * * *
  Header line        hline     int       *
  Ending line        eline     int       * * * *
  Def/Dec Flag       def       enum      * * * * *
  Checksum           chksum    string    * * * *
  Length             length              *
  CPP Length         cpplen    int       *

There are three classes of attributes: string, enumerated, and integer. Legal values of the three enumerated attributes ("kind", "sclass", and "def") are described in the Database Formats section. The attribute "hline" is the ending line of a function header (up to the last line of the argument declaration list). Only a function has an "hline" attribute. "Cpplen" is the length of a source file after C preprocessing.

Reference Relationships

Cia records reference relationships between CIA entities. A reference relationship exists between CIA entities A and B if one of the following conditions holds:

If a macro expands across definition or declaration boundaries, then references to this macro may or may not be recorded in the database. Such programming practice is not encouraged.

The following table shows all the reference relationships defined in the CIA conceptual model.

             func var macro type file
  -----------------------------------
  func       * * * *
  var        * * * *
  macro
  type       * * *
  file       * *

Database Formats

By default, the CIA database consists of two database files: entity.db and relationship.db. The following short aliases are used to represent entity kinds in these databases:

           f        file
           p        function
           v        variable
           m        macro
           t        type
           u        unknown
  
The aliases are used in the database to save disk space, but the full names must be presented to queries (see ciaql(1)). The CIA interpretation of storage class, the "sclass" attribute, is different from that defined in most C textbooks. Its value is encoded to represent storage class, scope, and sub-kind information about an entity. The "sclass" value "global" is reserved for non-static function and variable definitions only. All non-global and non-static function or variable declarations are considered "extern". This special encoding scheme simplifies the query processing of several CIA tools. The following aliases are used for the set of legal "sclass" values:
           g        global        (external definition)
           e        extern        (external declaration)
           s        static        (static definition or declaration)
           t        typedef       (types defined through typedef)
           l        libsym        (library symbols)
           d        macdef        (#define macros)
           u        macudef       (#undef macros)
           n        none
  
The "def" attribute has three possible values:
           df        definition
           dc        declaration
           ud        un-definition   (for #undef macros only)
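
The alias listings and the "sclass" rule for functions and variables can be summarized in a short Python sketch. It is not part of cia: the manual does not describe the on-disk record layout here, so the sketch assumes the three alias strings have already been extracted from a record, and the function names are hypothetical.

```python
# Alias tables copied from the listings in this section.
KIND = {"f": "file", "p": "function", "v": "variable",
        "m": "macro", "t": "type", "u": "unknown"}
SCLASS = {"g": "global", "e": "extern", "s": "static", "t": "typedef",
          "l": "libsym", "d": "macdef", "u": "macudef", "n": "none"}
DEF = {"df": "definition", "dc": "declaration", "ud": "un-definition"}


def expand(kind, sclass, def_flag):
    """Translate the short aliases of one entity into full names."""
    return KIND[kind], SCLASS[sclass], DEF[def_flag]


def sclass_for(is_static, is_definition):
    """Apply the stated rule for functions and variables.

    "global" is reserved for non-static definitions; every other
    non-static declaration is "extern".
    """
    if is_static:
        return "static"
    return "global" if is_definition else "extern"
```

Separate tables are kept per attribute because the same letter means different things in different columns: "t" is "type" as a kind but "typedef" as an sclass, and "u" is "unknown" as a kind but "macudef" as an sclass.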
  
A structure or union declaration is considered a definition in the CIA context for the purpose of reachability analysis (see subsys(1)). A set of cql hash indices is generated automatically after the program database is created. If you intend to use a different database system, such as Dayview(1), or any commercial database system, you must create your own database indices.

Options
          cia_proc file.A _cia.H
  
after each .A file is created. See cia_proc(1) for more details. If the -c option (incremental abstraction) is not specified, then the -i option will cause _cia.H to be linked together with all specified .A files. For example, suppose various subsets of a1.h, a2.h, a3.h, and /usr/include/stdio.h are included by f1.c, f2.c, ..., f9.c, and the file "flist" stores all these file names (see the "-F" option). If the following command is run,
          $ cia -i -c -F flist
  
then information about a1.h, a2.h, a3.h, and /usr/include/stdio.h will be kept in _cia.H. Since _cia.H accumulates include file information over a period of time, you may want to use cia_scrub(1) before you link all the .A files to make sure that _cia.H does not keep information on obsolete include files. For example,
          $ cia_scrub flist _cia.H
          $ cia -i f1.A f2.A f3.A
  
Again, the -i flag causes _cia.H to be linked with f1.A, f2.A, and f3.A. Using the -i option helps save disk space for .A files; however, it should be exercised with caution if different conditional compilation flags are used for different .c files. Also, if you want to use incl(1) to perform include-file analysis on a particular .A file, then see the -W option described below.
          $ cia -j dir1/f1.A dir2/f2.A f3.A
  
will cause "dir1/" to be prepended to all relative pathnames in f1.A and "dir2/" to those in f2.A during database linking. This option is not compatible with the -P option.
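
The effect of the pathname adjustment described above can be illustrated with a small Python sketch. It models only the stated behavior (the directory of each .A file is prepended to relative pathnames, and absolute pathnames are left alone) and is not cia's implementation; the function name rebase is hypothetical.

```python
import posixpath


def rebase(pathname, abstract_file):
    """Prepend the directory of abstract_file to a relative pathname."""
    prefix = posixpath.dirname(abstract_file)
    # Nothing to prepend for a .A file in the current directory, and
    # absolute pathnames such as /usr/include/stdio.h stay unchanged.
    if not prefix or posixpath.isabs(pathname):
        return pathname
    return posixpath.join(prefix, pathname)
```

Under these assumptions, a pathname f1.c recorded in dir1/f1.A becomes dir1/f1.c, while pathnames recorded in f3.A are left as they are.
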
          $ cia -i -Wc,'incl -I' -c f1.c
  
This command causes the following commands to be executed in sequence:
          cia -c f1.c
          incl -I f1.A
          cia_proc f1.A _cia.H
  


FILES


  
  file.c                                  input C source file
  file.A                                  abstract database file created with "cia -c file.c"
  file.D                                  dumped partial abstract database file when errors occur
  file.h                                  input C header file
  file.H                                  abstract database file created with "cia -c file.h"
  entity.db                               entity database file
  relationship.db                         relationship database file
  entity.DB                               hash index files for entity.db
  relationship.DB                         hash index files for relationship.db
  target.db                               virtual cia database created with "cia -o target"
  target.DB                               hash index files for target.db
  /usr/include                            standard directory for #include files
  $INSTALLROOT/lib/ciao/ciao_cc.cql       cia schema in cql format
  $INSTALLROOT/bin/ciacom                 parser
  $INSTALLROOT/lib/probe                  probe directory, see pp(3)
  


SEE ALSO

Yih-Farn Chen, "Reverse Engineering", in Chapter 6 of "Practical Reusable UNIX Software", edited by Balachander Krishnamurthy, John Wiley & Sons, New York, 1995.

Christopher A. Rath and Yih-Farn Chen, "A CIA Tutorial -- an Introduction to the AT&T C Information Abstractor", April 1993.

cc(1), ciaql(1), db2A(1), pp(3), cpp(1), probe(1), incl(1), cql(1).
Hush Online Technology
hush@cs.vu.nl
09/24/99