LSAModule.RewriteMatrix Class
Reads the terms.dat and matrix.txt files, creates an LSAModuleObject from them, and serializes it.

Access: Public
Base Classes: Object
  Members Description  
    numberDocuments    
    numberTerms    
    numberFactors    
    matrixFile    
    termsFile    
    matrixReader    
    termsReader    
    indexedTermsList    
    indexedVectorsList    
    singularValues    
    termHash    
    RewriteMatrix    
    Populate Loads the terms into a list, loads the vectors into a list, cross-indexes the two lists, and creates a term hash (word, {vector, scale}).

 
    LoadTerms Reads terms.dat. Each line is a tab-separated triple, word\tindex\tweight. The triples are stored as IndexedTerms in a list.

 
    LoadMatrix Reads matrix.txt. The file has three regions: term vectors, document vectors, and singular values. We don't care about the document vectors. Term vectors are stored as IndexedVectors in a list.

 
    Skip A silly method to skip past the document vectors, which we don't care about.

 
    ProcessTerms Turns the vectors in the terms region of the file into IndexedVectors and stores them in a list.

 
    GetNextVector Each line holds at most six floats, sometimes fewer, depending on the number of dimensions. We're therefore cagey about reading lines.

 
    CreateTermHash Cross-indexes the indexed lists of terms and vectors. Combines these two redundant data structures into a non-redundant term hash.

 
    CreateLSASpace Populates the termHash with terms and vectors, then makes a space. Note the added features for tracking the author, the creation date, and the documents used in the space.

 
    SerializeLSASpace Serializes the LSA space. The file name is generated automatically from the name and date of the space.

 
    GetSourceFiles Given a directory name, returns a single string listing all the files in that directory, including file extensions, with the file names separated by spaces. Convenient when calling CreateLSASpace with lots of source files.

 
    ImplementCustomILSACalculator Unrolls loops, etc., and builds a custom LSACalculator class for this particular space. Should be faster than the normal loops. Be careful when changing the namespaces or class names of items in this assembly: doing so may break the code generated here.
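
The parsing steps described for LoadTerms, GetNextVector, and CreateTermHash can be sketched roughly as follows. This is an illustrative Python sketch, not the class's actual implementation. The function names, the tuple layout, the whitespace-separated float format, and the reading of "scale" as the term weight are all assumptions made for the example.

```python
from typing import Dict, Iterator, List, TextIO, Tuple

Term = Tuple[str, int, float]            # (word, index, weight), hypothetical layout
IndexedVector = Tuple[int, List[float]]  # (index, vector), hypothetical layout


def load_terms(reader: TextIO) -> List[Term]:
    """Parse lines of the form word\\tindex\\tweight into triples."""
    terms: List[Term] = []
    for line in reader:
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are tolerated and ignored
        word, index, weight = line.split("\t")
        terms.append((word, int(index), float(weight)))
    return terms


def get_next_vector(lines: Iterator[str], dimensions: int) -> List[float]:
    """Collect floats line by line until `dimensions` values are read.

    Assumes floats are whitespace-separated and that each vector
    starts on a fresh line, so a vector never shares a line with the next.
    """
    values: List[float] = []
    while len(values) < dimensions:
        line = next(lines, None)
        if line is None:
            raise ValueError("file ended in the middle of a vector")
        values.extend(float(token) for token in line.split())
    if len(values) != dimensions:
        raise ValueError("line layout does not match the dimension count")
    return values


def create_term_hash(
    terms: List[Term], vectors: List[IndexedVector]
) -> Dict[str, Tuple[List[float], float]]:
    """Cross-index terms and vectors into word -> (vector, scale).

    Assumption: the 'scale' stored in the hash is the term weight.
    """
    by_index = {index: vector for index, vector in vectors}
    return {
        word: (by_index[index], weight)
        for word, index, weight in terms
        if index in by_index
    }
```

With eight dimensions, for example, each vector would occupy one line of six floats followed by one line of two, which is why the reader counts values rather than lines.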