CPAN->GREP


1 to 25 of 833 distributions (8.02 seconds)
Lingua-NATools-v0.7.10/Build.PL
                      't/bin/*.o', 't/bin/*.exe',
                      't/bin/corpus', 't/bin/words',
                      'Lingua-NATools-*',
AMBS/Lingua-NATools-v0.7.10 65 more files »
uplug-main-0.3.8/Uplug-0.3.8/bin/uplug-convert
		'language' => 1,
		'corpus' => 1,
		);
TIEDEMANN/uplug-main-0.3.8 93 more files »
Lingua-Ogmios-0.011/etc/ogmios/nlpplatform-demo.rc
#     NO_STD_XML_OUTPUT = 2 # termlistoutput
#      NO_STD_XML_OUTPUT = 3 # HTML output (corpus and tagged terms)
#       NO_STD_XML_OUTPUT = 5 # Txt output
THHAMON/Lingua-Ogmios-0.011 34 more files »
Lingua-YaTeA-0.622/bin/yatea
yatea - Perl script for extracting terms from a corpus of texts and
providing a syntactic analysis in a head-modifier representation.
=item    I<file>               corpus of texts in Flemm or TreeTagger output format
YaTeA aims at extracting noun phrases that look like terms from a
corpus. It also provides their syntactic analysis in a head-modifier
format.
27 more matches »
THHAMON/Lingua-YaTeA-0.622 43 more files »
Algorithm-VSM-1.70/examples/calculate_precision_and_recall_for_LSA.pl
my $corpus_dir = "corpus";                     # This is the directory containing
                                               # the corpus
5 more matches »
AVIKAK/Algorithm-VSM-1.70 17 more files »
Alvis-TermTagger-0.82/bin/TermTagger-brat.pl
TermTagger.pl [options] corpus termlist selected_term_list lemmatised_corpus
This script tags a corpus with terms and provide a output compatible with Brat (<http://brat.nlplab.org/>). Corpus (C<corpus>) is a file
with one sentence per line. Term list (C<termlist>) is a file
6 more matches »
THHAMON/Alvis-TermTagger-0.82 12 more files »
Lingua-Align-0.04/bin/add_english_treetags
#
# add tree tagger tags & lemmas to an english tigerXML corpus
#
TIEDEMANN/Lingua-Align-0.04 53 more files »
Text-Corpus-CNN-1.02/lib/Text/Corpus/CNN.pm
#12345678901234567890123456789012345678901234
#Make a corpus of CNN documents for research.
C<Text::Corpus::CNN> - Make a corpus of CNN documents for research.
  Log::Log4perl->easy_init ($INFO);
  my $corpusDirectory = File::Spec->catfile (getcwd(), 'corpus_cnn');
  my $corpus = Text::Corpus::CNN->new (corpusDirectory => $corpusDirectory);
65 more matches »
KUBINA/Text-Corpus-CNN-1.02 7 more files »
Text-Corpus-NewYorkTimes-1.01/lib/Text/Corpus/NewYorkTimes.pm
#12345678901234567890123456789012345678901234
#Interface to New York Times corpus.
C<Text::Corpus::NewYorkTimes> - Interface to New York Times corpus.
  Log::Log4perl->easy_init ($INFO);
  my $corpus = Text::Corpus::NewYorkTimes->new (fileList => $fileList, corpusDirectory => $corpusDirectory);
  dump $corpus->getTotalDocuments;
120 more matches »
KUBINA/Text-Corpus-NewYorkTimes-1.01 6 more files »
Text-Corpus-VoiceOfAmerica-1.03/lib/Text/Corpus/VoiceOfAmerica.pm
C<Text::Corpus::VoiceOfAmerica> - Make a corpus of VOA documents for research.
  Log::Log4perl->easy_init ($INFO);
  my $corpusDirectory = File::Spec->catfile (getcwd(), 'corpus_voa');
  my $corpus = Text::Corpus::VoiceOfAmerica->new (corpusDirectory => $corpusDirectory);
60 more matches »
KUBINA/Text-Corpus-VoiceOfAmerica-1.03 7 more files »
Dist-Zilla-6.008/dist.ini
[MetaNoIndex]
dir = corpus
dir = misc
parent  = 0 ; used by the AutoPrereq test corpus
RJBS/Dist-Zilla-6.008 44 more files »
DiaColloDB-0.11.002/Changes
	* added 'mi1' score function (raw PMI)
	  - not too useful without pre-filtered corpus: too sensitive to low-frequency outliers
	  - fix for e.g. author-profiles
	* allow ddc queries without primary targets (=1), for 'subcorpus comparison'
	* merged -r 15013:15014 diacollo-0.07.006+vsem into DDC.pm
	  - fixes for pseudo-corpus comparison
MOOCOW/DiaColloDB-0.11.002 19 more files »
Alt-CWB-ambs-2.2.102.5/Changes
TODO:
  - implement tests for the new CWB::CQP interface, using the included VSS corpus
  - complete reorganisation of CWB/Perl modules into packages CWB (utility functions,
    corpus encoding, CQP interface) and CWB-CL (API for low-level corpus access);
    WebCqp functionality and demo Web interface will be released as a separate package
AMBS/Alt-CWB-ambs-2.2.102.5 15 more files »
Text-Corpus-Inspec-1.00/lib/Text/Corpus/Inspec.pm
#12345678901234567890123456789012345678901234
#Interface to Inspec abstracts corpus.
C<Text::Corpus::Inspec> - Interface to Inspec abstracts corpus.
  Log::Log4perl->easy_init ($INFO);
  my $corpus = Text::Corpus::Inspec->new (corpusDirectory => $corpusDirectory);
  dump $corpus->getTotalDocuments;
70 more matches »
KUBINA/Text-Corpus-Inspec-1.00 6 more files »
Text-Corpus-Summaries-Wikipedia-0.22/lib/Text/Corpus/Summaries/Wikipedia.pm
  use Data::Dump qw(dump);
  my $corpus = Text::Corpus::Summaries::Wikipedia->new;
  $corpus->create;
  dump $corpus->getListOfXmlFiles;
46 more matches »
KUBINA/Text-Corpus-Summaries-Wikipedia-0.22 4 more files »
Lingua-Interset-2.052/Changes
Thanks to Saša Rosen, who tries to
use DZ Interset together with a multi-language parallel corpus called
Intercorp, we also created a driver for the IPI PAN Polish corpus, which in
turn caused one systemic change: o-tags (those setting the other feature) can
References). Dan added a driver for the Czech tags of the Multext East
multilingual corpus.
ZEMAN/Lingua-Interset-2.052 66 more files »
DiaColloDB-WWW-0.01.010/dcdb-www-create.perl
	       'local.rc' => dist_file("DiaColloDB-WWW",'rc/local.rc'),
	       'dstar/corpus.ttk' => dist_file("DiaColloDB-WWW",'rc/corpus.ttk'),
	       'dstar/custom.ttk' => dist_file("DiaColloDB-WWW",'rc/custom.ttk'),
	   'local-rc|local|lrc|l=s' => \$rcfiles{'local.rc'},
	   'corpus-ttk|corpus|c=s' => \$rcfiles{'dstar/corpus.ttk'},
	   'custom-ttk|custom|C=s'  => \$rcfiles{'dstar/custom.ttk'},
6 more matches »
MOOCOW/DiaColloDB-WWW-0.01.010 15 more files »
Unicode-Tussle-1.111/data/words.utf8
          	blood-bay [adj.] ← blood
bloodbeat             	 › blood-beat, -circulation, -clot, -corpuscle, -disease, -drop, -flow, -freezer, -gout, -mark, -spoor, -spot, -stream, -supply, -system, blood
ing, -monger, -offering, -seller, -wreaker, blood-curdling, -stirring, -stirringness ← blood
bloodcorpuscles       	 › blood-corpuscles, lymph-c, Malpighian corpuscles, splenic c, Pacinian c, c. of Vater ← corpuscle
bloodcount   
105 more matches »
BDFOY/Unicode-Tussle-1.111
Lingua-BrillTagger-0.02/lib/Lingua/BrillTagger.xs
			     the training set  
			     When training on a very small corpus, better
			     performance might be obtained by setting this to
void
_load_into_corpus( self, word )
     SV   * self
KWILLIAMS/Lingua-BrillTagger-0.02 1 more file »
Alvis-NLPPlatform-0.6/bin/alvis-nlp-standalone
alvis-nlp-standalone - Perl script for linguistically annotating a corpus contained in a file
THHAMON/Alvis-NLPPlatform-0.6 6 more files »
Lingua-EN-Inflexion-0.000006/lib/Lingua/EN/Inflexion.pm
              # "7 formulas found"
              # "7 corpuses found"
              # "7 brothers found"
DCONWAY/Lingua-EN-Inflexion-0.000006 7 more files »
String-Sections-0.3.2/corpus/template/parse_filehandle.tpl
my $corpus;
my $parsefiles;
};
nofatals 'resolve corpus dir' => sub {
  $corpus = path($FindBin::Bin)->parent->parent->parent->child('corpus');
};
2 more matches »
KENTNL/String-Sections-0.3.2 16 more files »
App-Licensecheck-v3.0.28/Changes
 [ Test Suite ]
 - Use Test::Roo and library calls (not script) for devscripts corpus
   license coverage tests.
 [ Test Suite ]
 - Add devscripts test tied to devscripts corpus, converted from earlier
   shunit2 script.
JONASS/App-Licensecheck-v3.0.28 5 more files »
WordNet-Similarity-2.07/doc/config.pod
information content file containing the frequency of occurrence of every
WordNet concept in a large corpus. A number of utility programs are
included in this distribution that can be used to generate an inf
TPEDERSE/WordNet-Similarity-2.07 18 more files »
Text-Mining-0.08/Text-Mining/lib/Text/Mining/Parser.pm
To parse any document, you must first create a corpus object. 
(The corpus must have been created previously).
    my $corpus   = Text::Mining::Corpus->new({ corpus_id => 1 });
10 more matches »
ROGERHALL/Text-Mining-0.08 2 more files »