Message from discussion
translated to Ruby
Message-ID: <b2baff6b0901071046y24dce000t159817b43c0a668a@mail.gmail.com>
Date: Wed, 7 Jan 2009 10:46:37 -0800
From: "Adam Kapelner" <kapel...@gmail.com>
To: php-text-statistics@googlegroups.com
Subject: Re: translated to Ruby
In-Reply-To: <e8c787520901070202wc4731edwb9951bf004fec...@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_Part_245913_5335230.1231353997326"
References: <45ac2e73-1edf-49a2-b265-925370aa8...@z27g2000prd.googlegroups.com>
<e8c787520901070202wc4731edwb9951bf004fec...@mail.gmail.com>
------=_Part_245913_5335230.1231353997326
Content-Type: text/plain; charset=ISO-8859-1
My pleasure.
I couldn't figure out how to translate that one line in the clean text
function:
#$strText = preg_replace_callback('/\. [^ ]+/', create_function
('$matches', 'return strtolower($matches[0]);'), $strText); // Lower
case all words following terminators (for gunning fog score)
So I guess that's the reason why my Gunning-Fog is off a bit.
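For reference, that PHP line seems to map onto the block form of Ruby's String#gsub, which passes each match to a block much as preg_replace_callback passes it to the callback. This is only a minimal sketch, not part of the posted class, and the method name lowercase_after_terminators is hypothetical:

```ruby
# Hypothetical helper (name is an assumption, not from the posted code).
# Lowercases every word that directly follows a ". " terminator,
# mirroring the PHP preg_replace_callback('/\. [^ ]+/', ...) call.
def lowercase_after_terminators(text)
  # The block receives each match (the terminator, the space and the
  # following word) and returns its replacement.
  text.gsub(/\. [^ ]+/) { |match| match.downcase }
end

puts lowercase_after_terminators('One. Two words. Three')
# prints "One. two words. three"
```

Inside clean_text the same gsub call would presumably sit where the commented-out PHP line is now, just before the final strip.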
Would you mind adding these two files to the repository? I won't have any
time for the next two months.
Thanks!
Adam
On Wed, Jan 7, 2009 at 2:02 AM, David Child <d...@addedbytes.com> wrote:
>
> Great work Adam!
>
> On Tue, Jan 6, 2009 at 11:16 PM, way4thesub <kapel...@gmail.com> wrote:
> >
> > Hello all,
> >
> > I've translated the php-text-statistics package to Ruby, you can view
> > the files below. Please note I couldn't get the Gunning Fog Score to
> > work 100%
> >
> > Regards,
> > Adam
> >
> >
> > ############### Code
> > require 'collections/sequenced_hash'
> >
> > module ReadabilityIndices
> >
> > class Readability
> >
> > NumDecimalPlaces = 1
> >
> > Titles = SequencedHash.new
> > Titles[:flesch_kincaid_grade_level] = 'Flesch-Kincaid Grade level'
> > Titles[:flesch_kincaid_reading_ease] = 'Flesch-Kincaid Reading Ease'
> > Titles[:gunning_fog_score] = 'Gunning-Fog score'
> > Titles[:coleman_liau_index] = 'Coleman-Liau Index'
> > Titles[:smog_index] = 'SMOG Index'
> > Titles[:automated_readability_index] = 'Automated Readability Index'
> >
> > attr_accessor :text
> > def initialize(text = '')
> > self.text = clean_text(text)
> > end
> >
> > def valid_index?(index)
> > Titles[index] ? true : false
> > end
> >
> > def flesch_kincaid_grade_level
> > round(0.39 * average_words_per_sentence + 11.8 *
> > average_syllables_per_word - 15.59, NumDecimalPlaces)
> > end
> >
> > def flesch_kincaid_reading_ease
> > round(206.835 - 1.015 * average_words_per_sentence - 84.6 *
> > average_syllables_per_word, NumDecimalPlaces)
> > end
> >
> > def gunning_fog_score
> > round((average_words_per_sentence +
> > percentage_words_with_three_syllables(false)) * 0.4, NumDecimalPlaces)
> > end
> >
> > def coleman_liau_index
> > round(5.89 * letter_count / word_count - 0.3 * sentence_count /
> > word_count - 15.8, NumDecimalPlaces)
> > end
> >
> > def smog_index
> > round(1.043 * Math.sqrt((words_with_three_syllables * (30 /
> > sentence_count)) + 3.1291), NumDecimalPlaces)
> > end
> >
> > def automated_readability_index
> > round(4.71 * letter_count / word_count + 0.5 * word_count /
> > sentence_count - 21.43, NumDecimalPlaces)
> > end
> >
> > Colon = ": "
> > Separator = ", "
> > def get_indices_as_string(indices = [], diagnostics = true)
> > indices = (indices.empty? ? Titles.keys : indices)
> > str = indices.inject([]){|arr, index| arr << "#{Titles[index]}#{Colon} #{self.send(index)}"; arr}.join(Separator)
> > return diagnostics ? "words#{Colon} #{word_count}#{Separator} sentences#{Colon} #{sentence_count}#{Separator} characters#{Colon} #{letter_count}#{Separator}" + str : str
> > end
> >
> > # private
> > def clean_text(text)
> > text.gsub!(/[,:;()-]/, ' ') # Replace commas, hyphens etc (count them as spaces)
> > text.gsub!(/[\.!?]/, '.') # Unify terminators
> > text = text.strip + '.' # Add final terminator, just in case it's missing.
> > text.gsub!(/[ ]*(\n|\r\n|\r)[ ]*/, ' ') # Replace new lines with spaces
> > text.gsub!(/([\.])[\.\s?]+/, ".") # Check for duplicated terminators
> > text.gsub!(/[ ]*([\.])/, '\1 ') # Pad sentence terminators ('\1' is the captured terminator)
> > text.gsub!(/[ ]+/, ' ') # Remove multiple spaces
> > #$strText = preg_replace_callback('/\. [^ ]+/', create_function('$matches', 'return strtolower($matches[0]);'), $strText); // Lower case all words following terminators (for gunning fog score)
> > return text.strip
> > end
> >
> > def round(num, decimals)
> > (num * 10 ** decimals).round / (10 ** decimals).to_f
> > end
> >
> > def letter_count
> > self.text.gsub(/[^A-Za-z]+/, '').length.to_i
> > end
> >
> > def sentence_count
> > [1, self.text.split(/\.!?/).length].max
> > end
> >
> > def word_count
> > get_words.length
> > end
> >
> > def get_words
> > @words ||= self.text.split(/\s+/)
> > end
> >
> > def average_words_per_sentence
> > word_count / sentence_count.to_f
> > end
> >
> > def average_syllables_per_word
> > total_syllables / get_words.length.to_f
> > end
> >
> > def total_syllables
> > get_words.inject(0){|sum, word| sum + syllable_count(word)}
> > end
> >
> > def words_with_three_syllables(count_proper_nouns = true)
> > get_words.inject(0) do |sum, word|
> > if syllable_count(word) >= 3
> > if count_proper_nouns
> > sum += 1
> > else
> > sum += 1 if word[0..0] == word[0..0].downcase
> > end
> > end
> > sum
> > end
> > end
> >
> > def percentage_words_with_three_syllables(count_proper_nouns = true)
> > words_with_three_syllables(count_proper_nouns) / word_count.to_f * 100
> > end
> >
> > ProblemWords = {
> > 'simile' => 3,
> > 'forever' => 3,
> > 'shoreline' => 2
> > }
> >
> > MultiSyllablesThatAreOne = [
> > /cial/,
> > /tia/,
> > /cius/,
> > /cious/,
> > /giu/,
> > /ion/,
> > /iou/,
> > /sia$/,
> > /[^aeiuoyt]{2,}ed$/,
> > /.ely$/,
> > /[cg]h?e[rsd]?$/,
> > /rved?$/,
> > /[aeiouy][dt]es?$/,
> > /[aeiouy][^aeiouydt]e[rsd]?$/,
> > /^[dr]e[aeiou][^aeiou]+$/, #Sorts out deal, deign etc
> > /[aeiouy]rse$/ #Purse, hears
> > ]
> >
> > UniSyllablesThatAreTwo = [
> > /ia/,
> > /riet/,
> > /dien/,
> > /iu/,
> > /io/,
> > /ii/,
> > /[aeiouym]bl$/,
> > /[aeiou]{3}/,
> > /^mc/,
> > /ism$/,
> > /([^aeiouy])\1l$/,
> > /[^l]lien/,
> > /^coa[dglx]./,
> > /[^gq]ua[^auieo]/,
> > /dnt$/,
> > /uity$/,
> > /ie(r|st)$/
> > ]
> >
> > PrefixesAndSuffixes = [
> > /^un/,
> > /^fore/,
> > /ly$/,
> > /less$/,
> > /ful$/,
> > /ers?$/,
> > /ings?$/
> > ]
> >
> > def syllable_count(word)
> > word = word.downcase.strip
> > #handle problem words first
> > return ProblemWords[word] if ProblemWords[word]
> >
> > #find number and delete prefixes and suffixes
> > num_syllables = PrefixesAndSuffixes.inject(0) do |sum, prefix|
> > word.scan(prefix){sum += 1}
> > word.gsub!(prefix, '')
> > sum
> > end
> >
> > #remove non-word chars
> > word.gsub!(/[^a-z]/, '')
> >
> > #count word parts:
> > num_syllables += word.split(/[^aeiouy]+/).inject(0){|sum, word_part| sum + (word_part.empty? ? 0 : 1)}
> >
> > #subtract out syllables that are really one:
> > MultiSyllablesThatAreOne.each{|syl| word.scan(syl){num_syllables -= 1}}
> >
> > #add syllables that are really two:
> > UniSyllablesThatAreTwo.each{|syl| word.scan(syl){num_syllables += 1}}
> >
> > return [1, num_syllables].max
> > end
> > end
> > end
> >
> > ############### RSpec tests
> > include ReadabilityIndices
> >
> > describe "readability indices" do
> > before(:each) do
> > @readability_blank = Readability.new
> > end
> >
> > it "should count simple syllable words correctly" do
> > @readability_blank.syllable_count('a').should == 1
> > @readability_blank.syllable_count('was').should == 1
> > @readability_blank.syllable_count('the').should == 1
> > @readability_blank.syllable_count('and').should == 1
> > @readability_blank.syllable_count('foobar').should == 2
> > @readability_blank.syllable_count('hello').should == 2
> > @readability_blank.syllable_count('world').should == 1
> > @readability_blank.syllable_count('wonderful').should == 3
> > @readability_blank.syllable_count('simple').should == 2
> > @readability_blank.syllable_count('easy').should == 2
> > @readability_blank.syllable_count('hard').should == 1
> > @readability_blank.syllable_count('quick').should == 1
> > @readability_blank.syllable_count('brown').should == 1
> > @readability_blank.syllable_count('fox').should == 1
> > @readability_blank.syllable_count('jumped').should == 1
> > @readability_blank.syllable_count('over').should == 2
> > @readability_blank.syllable_count('lazy').should == 2
> > @readability_blank.syllable_count('dog').should == 1
> > @readability_blank.syllable_count('camera').should == 3
> > end
> >
> > it "should count syllables on programmed exceptions" do
> > @readability_blank.syllable_count('simile').should == 3
> > @readability_blank.syllable_count('shoreline').should == 2
> > @readability_blank.syllable_count('forever').should == 3
> > end
> >
> > it "should count complex syllable words correctly" do
> > @readability_blank.syllable_count('antidisestablishmentarianism').should == 12
> > @readability_blank.syllable_count('supercalifragilisticexpialidocious').should == 14
> > @readability_blank.syllable_count('chlorofluorocarbonation').should == 8
> > @readability_blank.syllable_count('forethoughtfulness').should == 4
> > @readability_blank.syllable_count('phosphorescent').should == 4
> > @readability_blank.syllable_count('theoretician').should == 5
> > @readability_blank.syllable_count('promiscuity').should == 5
> > @readability_blank.syllable_count('unbutlering').should == 4
> > @readability_blank.syllable_count('continuity').should == 5
> > @readability_blank.syllable_count('craunched').should == 1
> > @readability_blank.syllable_count('squelched').should == 1
> > @readability_blank.syllable_count('scrounge').should == 1
> > @readability_blank.syllable_count('coughed').should == 1
> > @readability_blank.syllable_count('smile').should == 1
> > @readability_blank.syllable_count('monopoly').should == 4
> > @readability_blank.syllable_count('doughey').should == 2
> > @readability_blank.syllable_count('doughier').should == 3
> > @readability_blank.syllable_count('leguminous').should == 4
> > @readability_blank.syllable_count('thoroughbreds').should == 3
> > @readability_blank.syllable_count('special').should == 2
> > @readability_blank.syllable_count('delicious').should == 3
> > @readability_blank.syllable_count('spatial').should == 2
> > @readability_blank.syllable_count('pacifism').should == 4
> > @readability_blank.syllable_count('coagulant').should == 4
> > @readability_blank.syllable_count('shouldn\'t').should == 2
> > @readability_blank.syllable_count('mcdonald').should == 3
> > @readability_blank.syllable_count('audience').should == 3
> > @readability_blank.syllable_count('finance').should == 2
> > @readability_blank.syllable_count('prevalence').should == 3
> > @readability_blank.syllable_count('impropriety').should == 5
> > @readability_blank.syllable_count('alien').should == 3
> > @readability_blank.syllable_count('dreadnought').should == 2
> > @readability_blank.syllable_count('verandah').should == 3
> > @readability_blank.syllable_count('similar').should == 3
> > @readability_blank.syllable_count('similarly').should == 4
> > @readability_blank.syllable_count('central').should == 2
> > @readability_blank.syllable_count('cyst').should == 1
> > @readability_blank.syllable_count('term').should == 1
> > @readability_blank.syllable_count('order').should == 2
> > @readability_blank.syllable_count('fur').should == 1
> > @readability_blank.syllable_count('sugar').should == 2
> > @readability_blank.syllable_count('paper').should == 2
> > @readability_blank.syllable_count('make').should == 1
> > @readability_blank.syllable_count('gem').should == 1
> > @readability_blank.syllable_count('program').should == 2
> > @readability_blank.syllable_count('hopeless').should == 2
> > @readability_blank.syllable_count('hopelessly').should == 3
> > @readability_blank.syllable_count('careful').should == 2
> > @readability_blank.syllable_count('carefully').should == 3
> > @readability_blank.syllable_count('stuffy').should == 2
> > @readability_blank.syllable_count('thistle').should == 2
> > @readability_blank.syllable_count('teacher').should == 2
> > @readability_blank.syllable_count('unhappy').should == 3
> > @readability_blank.syllable_count('ambiguity').should == 5
> > @readability_blank.syllable_count('validity').should == 4
> > @readability_blank.syllable_count('ambiguous').should == 4
> > @readability_blank.syllable_count('deserve').should == 2
> > @readability_blank.syllable_count('blooper').should == 2
> > @readability_blank.syllable_count('scooped').should == 1
> > @readability_blank.syllable_count('deserve').should == 2
> > @readability_blank.syllable_count('deal').should == 1
> > @readability_blank.syllable_count('death').should == 1
> > @readability_blank.syllable_count('dearth').should == 1
> > @readability_blank.syllable_count('deign').should == 1
> > @readability_blank.syllable_count('reign').should == 1
> > @readability_blank.syllable_count('bedsore').should == 2
> > @readability_blank.syllable_count('anorexia').should == 5
> > @readability_blank.syllable_count('anymore').should == 3
> > @readability_blank.syllable_count('cored').should == 1
> > @readability_blank.syllable_count('sore').should == 1
> > @readability_blank.syllable_count('foremost').should == 2
> > @readability_blank.syllable_count('restore').should == 2
> > @readability_blank.syllable_count('minute').should == 2
> > @readability_blank.syllable_count('manticores').should == 3
> > @readability_blank.syllable_count('asparagus').should == 4
> > @readability_blank.syllable_count('unexplored').should == 3
> > @readability_blank.syllable_count('unexploded').should == 4
> > @readability_blank.syllable_count('CAPITALS').should == 3
> > end
> >
> > it "should calculate average syllables per word" do
> > Readability.new('and then there was one').average_syllables_per_word.should == 1
> > Readability.new('because special ducklings deserve rainbows').average_syllables_per_word.should == 2
> > Readability.new('and then there was one because special ducklings deserve rainbows').average_syllables_per_word.should == 1.5
> > end
> >
> > it "should count words correctly" do
> > Readability.new('The quick brown fox jumped over the lazy dog').word_count.should == 9
> > Readability.new('The quick brown fox jumped over the lazy dog.').word_count.should == 9
> > Readability.new('The quick brown fox jumped over the lazy dog. ').word_count.should == 9
> > Readability.new(' The quick brown fox jumped over the lazy dog. ').word_count.should == 9
> > Readability.new(' The quick brown fox jumped over the lazy dog. ').word_count.should == 9
> > Readability.new('Yes. No.').word_count.should == 2
> > Readability.new('Yes.No.').word_count.should == 2
> > Readability.new('Yes.No.').word_count.should == 2
> > Readability.new('Yes . No.').word_count.should == 2
> > Readability.new('Yes - No. ').word_count.should == 2
> > end
> >
> > it "should get percentage of words with three syllables" do
> > Readability.new('there is just one word with three syllables in this sentence').percentage_words_with_three_syllables.round.should == 9
> > Readability.new('there are no valid words with three Syllables in this sentence').percentage_words_with_three_syllables.round.should == 9
> > Readability.new('there is one and only one word with three or more syllables in this long boring sentence of twenty words').percentage_words_with_three_syllables.round.should == 5
> > Readability.new('there are two and only two words with three or more syllables in this long sentence of exactly twenty words').percentage_words_with_three_syllables.round.should == 10
> > Readability.new('there is Actually only one valid word with three or more syllables in this long sentence of Exactly twenty words').percentage_words_with_three_syllables(false).round.should == 5
> > Readability.new('no long words in this sentence').percentage_words_with_three_syllables.round.should == 0
> > Readability.new('no long valid words in this sentence because the test ignores proper case words like this Behemoth').percentage_words_with_three_syllables(false).round.should == 0
> > end
> >
> > it "should count letters" do
> > Readability.new('a').letter_count.should == 1
> > Readability.new('').letter_count.should == 0
> > Readability.new('this sentence has 30 characters, not including the digits').letter_count.should == 46
> > end
> >
> > it "should count sentences" do
> > Readability.new('This is a sentence').sentence_count.should == 1
> > Readability.new('This is a sentence.').sentence_count.should == 1
> > Readability.new('This is a sentence!').sentence_count.should == 1
> > Readability.new('This is a sentence?').sentence_count.should == 1
> > Readability.new('This is a sentence..').sentence_count.should == 1
> > Readability.new('This is a sentence. So is this.').sentence_count.should == 2
> > Readability.new("This is a sentence. \n\n So is this, but this is multi-line!").sentence_count.should == 2
> > Readability.new('This is a sentence,. So is this.').sentence_count.should == 2
> > Readability.new('This is a sentence!? So is this.').sentence_count.should == 2
> > Readability.new('This is a sentence. So is this. And this one as well.').sentence_count.should == 3
> > Readability.new('This is a sentence - but just one.').sentence_count.should == 1
> > Readability.new('This is a sentence (but just one).').sentence_count.should == 1
> > end
> >
> > it "should calculate average words per sentence" do
> > Readability.new('This is a sentence').average_words_per_sentence.should == 4
> > Readability.new('This is a sentence.').average_words_per_sentence.should == 4
> > Readability.new('This is a sentence. ').average_words_per_sentence.should == 4
> > Readability.new('This is a sentence. This is a sentence').average_words_per_sentence.should == 4
> > Readability.new('This is a sentence. This is a sentence.').average_words_per_sentence.should == 4
> > Readability.new('This, is - a sentence . This is a sentence. ').average_words_per_sentence.should == 4
> > Readability.new('This is a sentence with extra text. This is a sentence. ').average_words_per_sentence.should == 5.5
> > Readability.new('This is a sentence with some extra text. This is a sentence. ').average_words_per_sentence.should == 6
> > end
> >
> > describe "test indices directly" do
> > before(:each) do
> > @str_a = 'This. Is. A. Nice. Set. Of. Small. Words. Of. One. Part. Each.'
> > @str_b = 'The quick brown fox jumped over the lazy dog.'
> > @str_c = 'The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.'
> > @str_d = "The quick brown fox jumped over the lazy dog. \n\n The quick brown fox jumped over the lazy dog."
> > @str_e = 'The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog'
> > @str_f = 'Now it is time for a more complicated sentence, including several longer words.'
> > @str_g = 'Now it is time for a more Complicated sentence, including Several longer words.'
> > end
> >
> > it "should calculate flesch-kincaid reading ease" do
> > Readability.new(@str_a).flesch_kincaid_reading_ease.should ==
> > 121.2
> > Readability.new(@str_b).flesch_kincaid_reading_ease.should ==
> > 94.3
> > Readability.new(@str_c).flesch_kincaid_reading_ease.should ==
> > 94.3
> > Readability.new(@str_d).flesch_kincaid_reading_ease.should ==
> > 94.3
> > Readability.new(@str_e).flesch_kincaid_reading_ease.should ==
> > 94.3
> > Readability.new(@str_f).flesch_kincaid_reading_ease.should ==
> > 50.5
> > end
> >
> > it "should calculate flesch-kincaid grade level" do
> > Readability.new(@str_a).flesch_kincaid_grade_level.should ==
> > -3.4
> > Readability.new(@str_b).flesch_kincaid_grade_level.should ==
> > 2.3
> > Readability.new(@str_c).flesch_kincaid_grade_level.should ==
> > 2.3
> > Readability.new(@str_d).flesch_kincaid_grade_level.should ==
> > 2.3
> > Readability.new(@str_e).flesch_kincaid_grade_level.should ==
> > 2.3
> > Readability.new(@str_f).flesch_kincaid_grade_level.should ==
> > 9.4
> > end
> >
> > it "should calculate Gunning-Fog Score" do
> > Readability.new(@str_a).gunning_fog_score.should == 0.4
> > Readability.new(@str_b).gunning_fog_score.should == 3.6
> > Readability.new(@str_c).gunning_fog_score.should == 3.6
> > Readability.new(@str_d).gunning_fog_score.should == 3.6
> > Readability.new(@str_e).gunning_fog_score.should == 3.6
> > Readability.new(@str_f).gunning_fog_score.should == 14.4
> > Readability.new(@str_g).gunning_fog_score.should == 8.3
> > end
> >
> > it "should calculate coleman-liau index" do
> > Readability.new(@str_a).coleman_liau_index.should == 3.0
> > Readability.new(@str_b).coleman_liau_index.should == 7.7
> > Readability.new(@str_c).coleman_liau_index.should == 7.7
> > Readability.new(@str_d).coleman_liau_index.should == 7.7
> > Readability.new(@str_e).coleman_liau_index.should == 7.7
> > Readability.new(@str_f).coleman_liau_index.should ==
> > 13.6
> > end
> >
> > it "should calculate smog index" do
> > Readability.new(@str_a).smog_index.should == 1.8
> > Readability.new(@str_b).smog_index.should == 1.8
> > Readability.new(@str_c).smog_index.should == 1.8
> > Readability.new(@str_d).smog_index.should == 1.8
> > Readability.new(@str_e).smog_index.should == 1.8
> > Readability.new(@str_f).smog_index.should ==
> > 10.1
> > end
> >
> > it "should calculate automated readability index" do
> > Readability.new(@str_a).automated_readability_index.should ==
> > -5.6
> > Readability.new(@str_b).automated_readability_index.should ==
> > 1.9
> > Readability.new(@str_c).automated_readability_index.should ==
> > 1.9
> > Readability.new(@str_d).automated_readability_index.should ==
> > 1.9
> > Readability.new(@str_e).automated_readability_index.should ==
> > 1.9
> > Readability.new(@str_f).automated_readability_index.should ==
> > 8.6
> > end
> >
> > it "should index first paragraph of Moby Dick correctly" do
> > str =<<-ENDL
> > Call me Ishmael. Some years ago - never mind how long
> > precisely - having little or no money in my purse, and
> > nothing particular to interest me on shore, I thought I
> > would sail about a little and see the watery part of
> > the world. It is a way I have of driving off the spleen, and
> > regulating the circulation. Whenever I find myself
> > growing grim about the mouth; whenever it is a damp, drizzly
> > November in my soul; whenever I find myself
> > involuntarily pausing before coffin warehouses, and bringing
> > up the rear of every funeral I meet; and especially
> > whenever my hypos get such an upper hand of me, that it
> > requires a strong moral principle to prevent me from
> > deliberately stepping into the street, and methodically
> > knocking people's hats off - then, I account it high time
> > to get to sea as soon as I can. This is my substitute for
> > pistol and ball. With a philosophical flourish Cato
> > throws himself upon his sword; I quietly take to the ship.
> > There is nothing surprising in this. If they but knew
> > it, almost all men in their degree, some time or other,
> > cherish very nearly the same feelings towards the ocean with me.
> > ENDL
> >
> > readability = Readability.new(str)
> >
> > readability.letter_count.should == 884
> > readability.word_count.should == 201
> > readability.total_syllables.should == 304
> > readability.sentence_count.should == 8
> > readability.words_with_three_syllables.should == 23
> >
> > readability.flesch_kincaid_grade_level.should == 12.1
> > readability.flesch_kincaid_reading_ease.should == 53.4
> > readability.gunning_fog_score.should == 14.2
> > readability.coleman_liau_index.should == 10.1
> > readability.smog_index.should == 8.9
> > readability.automated_readability_index.should == 11.8
> > end
> >
> > it "should index a Kipling poem correctly" do
> > str =<<-ENDL
> > If you can keep your head when all about you
> > Are losing theirs and blaming it on you,
> > If you can trust yourself when all men doubt you
> > But make allowance for their doubting too,
> > If you can wait and not be tired by waiting,
> > Or being lied about, don't deal in lies,
> > Or being hated, don't give way to hating,
> > And yet don't look too good, nor talk too wise:
> >
> > If you can dream - and not make dreams your master,
> > If you can think - and not make thoughts your aim;
> > If you can meet with Triumph and Disaster
> > And treat those two impostors just the same;
> > If you can bear to hear the truth you've spoken
> > Twisted by knaves to make a trap for fools,
> > Or watch the things you gave your life to, broken,
> > And stoop and build 'em up with worn-out tools:
> >
> > If you can make one heap of all your winnings
> > And risk it all on one turn of pitch-and-toss,
> > And lose, and start again at your beginnings
> > And never breath a word about your loss;
> > If you can force your heart and nerve and sinew
> > To serve your turn long after they are gone,
> > And so hold on when there is nothing in you
> > Except the Will which says to them: "Hold on"
> >
> > If you can talk with crowds and keep your virtue,
> > Or walk with kings - nor lose the common touch,
> > If neither foes nor loving friends can hurt you;
> > If all men count with you, but none too much,
> > If you can fill the unforgiving minute
> > With sixty seconds' worth of distance run,
> > Yours is the Earth and everything that's in it,
> > And - which is more - you'll be a Man, my son
> > ENDL
> >
> > readability = Readability.new(str)
> >
> > readability.letter_count.should == 1125
> > readability.word_count.should == 292
> > readability.total_syllables.should == 338
> > readability.sentence_count.should == 1
> > readability.words_with_three_syllables.should == 6
> >
> > readability.flesch_kincaid_grade_level.should == 111.9
> > readability.flesch_kincaid_reading_ease.should == -187.5
> > readability.gunning_fog_score.should == 117.5
> > readability.coleman_liau_index.should == 6.9
> > readability.smog_index.should == 14.1
> > readability.automated_readability_index.should == 142.7
> > end
> > end
> > end
> >
> > >
> >
>
> >
>
------=_Part_245913_5335230.1231353997326
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
My pleasure.<br><br>I couldn't figure out how to translate that one lin=
e in the clean text function:<br>#$strText =3D preg_replace_callback('/=
\. [^ ]+/', create_function<br>
('$matches', 'return strtolower($matches[0]);'), $strText);=
// Lower<br>
case all words following terminators (for gunning fog score)<br><br>So I gu=
ess that's the reason why my Gunning-Fog is off a bit.<br><br>Would you=
mind adding these two files to the repository? I won't have any time f=
or the next two months.<br>
<br>Thanks!<br>Adam<br><br><br><div class=3D"gmail_quote">On Wed, Jan 7, 20=
09 at 2:02 AM, David Child <span dir=3D"ltr"><<a href=3D"mailto:dave@add=
edbytes.com">d...@addedbytes.com</a>></span> wrote:<br><blockquote class=
=3D"gmail_quote" style=3D"border-left: 1px solid rgb(204, 204, 204); margin=
: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
Great work Adam!<br>
<div><div></div><div class=3D"Wj3C7c"><br>
On Tue, Jan 6, 2009 at 11:16 PM, way4thesub <<a href=3D"mailto:kapelner@=
gmail.com">kapel...@gmail.com</a>> wrote:<br>
><br>
> Hello all,<br>
><br>
> I've translated the php-text-statistics package to Ruby, you can v=
iew<br>
> the files below. Please note I couldn't get the Gunning Fog Score =
to<br>
> work 100%<br>
><br>
> Regards,<br>
> Adam<br>
><br>
><br>
> ############### Code<br>
> require 'collections/sequenced_hash'<br>
><br>
> module ReadabilityIndices<br>
><br>
> class Readability<br>
><br>
> NumDecimalPlaces =3D 1<br>
><br>
> Titles =3D SequencedHash.new<br>
> Titles[:flesch_kincaid_grade_level] =3D 'Flesch-Kinca=
id Grade level'<br>
> Titles[:flesch_kincaid_reading_ease] =3D 'Flesch-Kinc=
aid Reading<br>
> Ease'<br>
> Titles[:gunning_fog_score] =3D 'Gunning-Fog score'=
;<br>
> Titles[:coleman_liau_index] =3D 'Coleman-Liau Index&#=
39;<br>
> Titles[:smog_index] =3D 'SMOG Index'<br>
> Titles[:automated_readability_index] =3D 'Automated R=
eadability<br>
> Index'<br>
><br>
> attr_accessor :text<br>
> def initialize(text =3D '')<br>
> self.text =3D clean_text(text)<br>
> end<br>
><br>
> def valid_index?(index)<br>
> Titles[index] ? true : false<br>
> end<br>
><br>
> def flesch_kincaid_grade_level<br>
> round(0.39 * average_words_per_sentence + 11.8 *<b=
r>
> average_syllables_per_word - 15.59, NumDecimalPlaces)<br>
> end<br>
><br>
> def flesch_kincaid_reading_ease<br>
> round(206.835 - 1.015 * average_words_per_sentence=
- 84.6 *<br>
> average_syllables_per_word, NumDecimalPlaces)<br>
> end<br>
><br>
> def gunning_fog_score<br>
> round((average_words_per_sentence +<br>
> percentage_words_with_three_syllables(false)) * 0.4, NumDecimalPlaces)=
<br>
> end<br>
><br>
> def coleman_liau_index<br>
> round(5.89 * letter_count / word_count - 0.3 * sen=
tence_count /<br>
> word_count - 15.8, NumDecimalPlaces)<br>
> end<br>
><br>
> def smog_index<br>
> round(1.043 * Math.sqrt((words_with_three_syllable=
s * (30 /<br>
> sentence_count)) + 3.1291), NumDecimalPlaces)<br>
> end<br>
><br>
> def automated_readability_index<br>
> round(4.71 * letter_count / word_count + 0.5 * wor=
d_count /<br>
> sentence_count - 21.43, NumDecimalPlaces)<br>
> end<br>
><br>
> Colon =3D ": "<br>
> Separator =3D ", "<br>
> def get_indices_as_string(indices =3D [], diagnostics =3D=
true)<br>
> indices =3D (indices.empty? ? Titles.keys : indice=
s)<br>
> str =3D indices.inject([]){|arr, index| arr <&l=
t; "#{Titles[index]}#<br>
> {Colon} #{self.send(index)}"; arr}.join(Separator)<br>
> return diagnostics ? "words#{Colon} #{word_co=
unt}#{Separator}<br>
> sentences#{Colon} #{sentence_count}#{Separator} characters#{Colon} #<b=
r>
> {letter_count}#{Separator}" + str : str<br>
> end<br>
><br>
> # private<br>
> def clean_text(text)<br>
> # Replace commas, hyphens etc (count them as spaces)<br>
> text.gsub!(/[,:;()-]/, ' ')<br>
> # Unify terminators<br>
> text.gsub!(/[\.!?]/, '.')<br>
> # Add final terminator, just in case it's missing.<br>
> text =3D text.strip + '.'<br>
> # Replace new lines with spaces<br>
> text.gsub!(/[ ]*(\n|\r\n|\r)[ ]*/, ' ')<br>
> # Check for duplicated terminators<br>
> text.gsub!(/([\.])[\.\s?]+/, ".")<br>
> # Pad sentence terminators ('\1' is the captured terminator;<br>
> # "#{$1}" would be interpolated before gsub! runs)<br>
> text.gsub!(/[ ]*([\.])/, '\1 ')<br>
> # Remove multiple spaces<br>
> text.gsub!(/[ ]+/, ' ')<br>
> # PHP original, not ported: lower case all words following<br>
> # terminators (for gunning fog score)<br>
> #$strText =3D preg_replace_callback('/\. [^ ]+/', create_function<br>
> # ('$matches', 'return strtolower($matches[0]);'), $strText);<br>
> return text.strip<br>
> end<br>
><br>
> def round(num, decimals)<br>
> (num * 10 ** decimals).round / (10 ** decimals).to_f<br>
> end<br>
><br>
> def letter_count<br>
> self.text.gsub(/[^A-Za-z]+/, '').length.to=
_i<br>
> end<br>
><br>
> def sentence_count<br>
> [1, self.text.split(/\.!?/).length].max<br>
> end<br>
><br>
> def word_count<br>
> get_words.length<br>
> end<br>
><br>
> def get_words<br>
> @words ||=3D self.text.split(/\s+/)<br>
> end<br>
><br>
> def average_words_per_sentence<br>
> word_count / sentence_count.to_f<br>
> end<br>
><br>
> def average_syllables_per_word<br>
> total_syllables / get_words.length.to_f<br>
> end<br>
><br>
> def total_syllables<br>
> get_words.inject(0){|sum, word| sum + syllable_cou=
nt(word)}<br>
> end<br>
><br>
> def words_with_three_syllables(count_proper_nouns =3D tru=
e)<br>
> get_words.inject(0) do |sum, word|<br>
> if syllable_count(word) >=3D 3<br>
> if count_proper_nouns<br>
> sum +=3D 1<br>
> else<br>
> sum +=3D 1 if word[0..0] =3D=
=3D word[0..0].downcase<br>
> end<br>
> end<br>
> sum<br>
> end<br>
> end<br>
><br>
> def percentage_words_with_three_syllables(count_proper_no=
uns =3D<br>
> true)<br>
> words_with_three_syllables(count_proper_nouns) /<br>
> word_count.to_f * 100<br>
> end<br>
><br>
> ProblemWords =3D {<br>
> 'simile' =3D> 3,<br>
> 'forever' =3D> 3,<br>
> 'shoreline' =3D> 2<br>
> }<br>
><br>
> MultiSyllablesThatAreOne =3D [<br>
> /cial/,<br>
> /tia/,<br>
> /cius/,<br>
> /cious/,<br>
> /giu/,<br>
> /ion/,<br>
> /iou/,<br>
> /sia$/,<br>
> /[^aeiuoyt]{2,}ed$/,<br>
> /.ely$/,<br>
> /[cg]h?e[rsd]?$/,<br>
> /rved?$/,<br>
> /[aeiouy][dt]es?$/,<br>
> /[aeiouy][^aeiouydt]e[rsd]?$/,<br>
> /^[dr]e[aeiou][^aeiou]+$/, #Sorts out deal, deign =
etc<br>
> /[aeiouy]rse$/ #Purse, hearse<br>
> ]<br>
><br>
> UniSyllablesThatAreTwo =3D [<br>
> /ia/,<br>
> /riet/,<br>
> /dien/,<br>
> /iu/,<br>
> /io/,<br>
> /ii/,<br>
> /[aeiouym]bl$/,<br>
> /[aeiou]{3}/,<br>
> /^mc/,<br>
> /ism$/,<br>
> /([^aeiouy])\1l$/,<br>
> /[^l]lien/,<br>
> /^coa[dglx]./,<br>
> /[^gq]ua[^auieo]/,<br>
> /dnt$/,<br>
> /uity$/,<br>
> /ie(r|st)$/<br>
> ]<br>
><br>
> PrefixesAndSuffixes =3D [<br>
> /^un/,<br>
> /^fore/,<br>
> /ly$/,<br>
> /less$/,<br>
> /ful$/,<br>
> /ers?$/,<br>
> /ings?$/<br>
> ]<br>
><br>
> def syllable_count(word)<br>
> word =3D word.downcase.strip<br>
> #handle problem words first<br>
> return ProblemWords[word] if ProblemWords[word]<br=
>
><br>
> #find number and delete prefixes and suffixes<br>
> num_syllables =3D PrefixesAndSuffixes.inject(0) do=
|sum, prefix|<br>
> word.scan(prefix){sum +=3D 1}<br>
> word.gsub!(prefix, '')<br>
> sum<br>
> end<br>
><br>
> #remove non-word chars<br>
> word.gsub!(/[^a-z]/i, '')<br>
><br>
> #count word parts:<br>
> num_syllables +=3D word.split(/[^aeiouy]+/).inject(0){|sum,<br>
> word_part| sum + (word_part.empty? ? 0 : 1)}<br>
><br>
> #subtract out syllables that are really one:<br>
> MultiSyllablesThatAreOne.each{|syl|<br>
> word.scan(syl){num_syllables -=3D 1}}<br>
><br>
> #add syllables that are really two:<br>
> UniSyllablesThatAreTwo.each{|syl|<br>
> word.scan(syl){num_syllables +=3D 1}}<br>
><br>
> return [1, num_syllables].max<br>
> end<br>
> end<br>
> end<br>
><br>
> ############### RSpec tests<br>
> include ReadabilityIndices<br>
><br>
> describe "readability indices" do<br>
> before(:each) do<br>
> @readability_blank =3D Readability.new<br>
> end<br>
><br>
> it "should count simple syllable words correctly&quo=
t; do<br>
> @readability_blank.syllable_count('a').sho=
uld =3D=3D 1<br>
> @readability_blank.syllable_count('was').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('the').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('and').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('foobar'=
).should =3D=3D 2<br>
> @readability_blank.syllable_count('hello')=
.should =3D=3D 2<br>
> @readability_blank.syllable_count('world')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('wonderful&#=
39;).should =3D=3D 3<br>
> @readability_blank.syllable_count('simple'=
).should =3D=3D 2<br>
> @readability_blank.syllable_count('easy').=
should =3D=3D 2<br>
> @readability_blank.syllable_count('hard').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('quick')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('brown')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('fox').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('jumped'=
).should =3D=3D 1<br>
> @readability_blank.syllable_count('over').=
should =3D=3D 2<br>
> @readability_blank.syllable_count('lazy').=
should =3D=3D 2<br>
> @readability_blank.syllable_count('dog').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('camera'=
).should =3D=3D 3<br>
> end<br>
><br>
> it "should count syllables on programmed exceptions&=
quot; do<br>
> @readability_blank.syllable_count('simile'=
).should =3D=3D 3<br>
> @readability_blank.syllable_count('shoreline&#=
39;).should =3D=3D 2<br>
> @readability_blank.syllable_count('forever').should =3D=3D 3<br>
> end<br>
><br>
> it "should count complex syllable words correctly&qu=
ot; do<br>
> @readability_blank.syllable_count(<br>
> 'antidisestablishmentarianism').should =3D=3D 12<br>
> @readability_blank.syllable_count(<br>
> 'supercalifragilisticexpialidocious').should =3D=3D 14<br>
> @readability_blank.syllable_count(<br>
> 'chlorofluorocarbonation').should =3D=3D 8<br>
> @readability_blank.syllable_count(<br>
> 'forethoughtfulness').should =3D=3D 4<br>
> @readability_blank.syllable_count('phosphoresc=
ent').should =3D=3D 4<br>
> @readability_blank.syllable_count('theoreticia=
n').should =3D=3D 5<br>
> @readability_blank.syllable_count('promiscuity=
').should =3D=3D 5<br>
> @readability_blank.syllable_count('unbutlering=
').should =3D=3D 4<br>
> @readability_blank.syllable_count('continuity&=
#39;).should =3D=3D 5<br>
> @readability_blank.syllable_count('craunched&#=
39;).should =3D=3D 1<br>
> @readability_blank.syllable_count('squelched&#=
39;).should =3D=3D 1<br>
> @readability_blank.syllable_count('scrounge').should =3D=3D 1<br>
> @readability_blank.syllable_count('coughed').should =3D=3D 1<br>
> @readability_blank.syllable_count('smile').should =3D=3D 1<br>
> @readability_blank.syllable_count('monopoly').should =3D=3D 4<br>
> @readability_blank.syllable_count('doughey').should =3D=3D 2<br>
> @readability_blank.syllable_count('doughier').should =3D=3D 3<br>
> @readability_blank.syllable_count('leguminous&=
#39;).should =3D=3D 4<br>
> @readability_blank.syllable_count('thoroughbre=
ds').should =3D=3D 3<br>
> @readability_blank.syllable_count('special').should =3D=3D 2<br>
> @readability_blank.syllable_count('delicious').should =3D=3D 3<br>
> @readability_blank.syllable_count('spatial').should =3D=3D 2<br>
> @readability_blank.syllable_count('pacifism').should =3D=3D 4<br>
> @readability_blank.syllable_count('coagulant&#=
39;).should =3D=3D 4<br>
> @readability_blank.syllable_count('shouldn\'t').should =3D=3D 2<br>
> @readability_blank.syllable_count('mcdonald').should =3D=3D 3<br>
> @readability_blank.syllable_count('audience').should =3D=3D 3<br>
> @readability_blank.syllable_count('finance').should =3D=3D 2<br>
> @readability_blank.syllable_count('prevalence&=
#39;).should =3D=3D 3<br>
> @readability_blank.syllable_count('impropriety=
').should =3D=3D 5<br>
> @readability_blank.syllable_count('alien')=
.should =3D=3D 3<br>
> @readability_blank.syllable_count('dreadnought=
').should =3D=3D 2<br>
> @readability_blank.syllable_count('verandah').should =3D=3D 3<br>
> @readability_blank.syllable_count('similar').should =3D=3D 3<br>
> @readability_blank.syllable_count('similarly').should =3D=3D 4<br>
> @readability_blank.syllable_count('central').should =3D=3D 2<br>
> @readability_blank.syllable_count('cyst').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('term').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('order')=
.should =3D=3D 2<br>
> @readability_blank.syllable_count('fur').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('sugar')=
.should =3D=3D 2<br>
> @readability_blank.syllable_count('paper')=
.should =3D=3D 2<br>
> @readability_blank.syllable_count('make').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('gem').s=
hould =3D=3D 1<br>
> @readability_blank.syllable_count('program').should =3D=3D 2<br>
> @readability_blank.syllable_count('hopeless').should =3D=3D 2<br>
> @readability_blank.syllable_count('hopelessly').should =3D=3D 3<br>
> @readability_blank.syllable_count('careful').should =3D=3D 2<br>
> @readability_blank.syllable_count('carefully&#=
39;).should =3D=3D 3<br>
> @readability_blank.syllable_count('stuffy'=
).should =3D=3D 2<br>
> @readability_blank.syllable_count('thistle').should =3D=3D 2<br>
> @readability_blank.syllable_count('teacher').should =3D=3D 2<br>
> @readability_blank.syllable_count('unhappy').should =3D=3D 3<br>
> @readability_blank.syllable_count('ambiguity').should =3D=3D 5<br>
> @readability_blank.syllable_count('validity').should =3D=3D 4<br>
> @readability_blank.syllable_count('ambiguous').should =3D=3D 4<br>
> @readability_blank.syllable_count('deserve').should =3D=3D 2<br>
> @readability_blank.syllable_count('blooper').should =3D=3D 2<br>
> @readability_blank.syllable_count('scooped').should =3D=3D 1<br>
> @readability_blank.syllable_count('deserve').should =3D=3D 2<br>
> @readability_blank.syllable_count('deal').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('death')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('dearth'=
).should =3D=3D 1<br>
> @readability_blank.syllable_count('deign')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('reign')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('bedsore').should =3D=3D 2<br>
> @readability_blank.syllable_count('anorexia').should =3D=3D 5<br>
> @readability_blank.syllable_count('anymore').should =3D=3D 3<br>
> @readability_blank.syllable_count('cored')=
.should =3D=3D 1<br>
> @readability_blank.syllable_count('sore').=
should =3D=3D 1<br>
> @readability_blank.syllable_count('foremost').should =3D=3D 2<br>
> @readability_blank.syllable_count('restore').should =3D=3D 2<br>
> @readability_blank.syllable_count('minute'=
).should =3D=3D 2<br>
> @readability_blank.syllable_count('manticores&=
#39;).should =3D=3D 3<br>
> @readability_blank.syllable_count('asparagus&#=
39;).should =3D=3D 4<br>
> @readability_blank.syllable_count('unexplored&=
#39;).should =3D=3D 3<br>
> @readability_blank.syllable_count('unexploded&=
#39;).should =3D=3D 4<br>
> @readability_blank.syllable_count('CAPITALS').should =3D=3D 3<br>
> end<br>
><br>
> it "should calculate average syllables per word" do<br>
> Readability.new('and then there was<br>
> one').average_syllables_per_word.should =3D=3D 1<br>
> Readability.new('because special ducklings des=
erve<br>
> rainbows').average_syllables_per_word.should =3D=3D 2<br>
> Readability.new('and then there was one becaus=
e special<br>
> ducklings deserve rainbows').average_syllables_per_word.should =3D=
=3D<br>
> 1.5<br>
> end<br>
><br>
> it "should count words correctly" do<br>
> Readability.new('The quick brown fox jumped ov=
er the lazy<br>
> dog').word_count.should =3D=3D 9<br>
> Readability.new('The quick brown fox jumped ov=
er the lazy<br>
> dog.').word_count.should =3D=3D 9<br>
> Readability.new('The quick brown fox jumped ov=
er the lazy dog.<br>
> ').word_count.should =3D=3D 9<br>
> Readability.new(' The quick brown fox jumped o=
ver the lazy dog.<br>
> ').word_count.should =3D=3D 9<br>
> Readability.new(' The quick brown fox ju=
mped over the lazy dog.<br>
> ').word_count.should =3D=3D 9<br>
> Readability.new('Yes. No.').word_count.sho=
uld =3D=3D 2<br>
> Readability.new('Yes.No.').word_count.shou=
ld =3D=3D 2<br>
> Readability.new('Yes.No.').word_count.shou=
ld =3D=3D 2<br>
> Readability.new('Yes . No.').word_count.sh=
ould =3D=3D 2<br>
> Readability.new('Yes - No. ').word_count.s=
hould =3D=3D 2<br>
> end<br>
><br>
> it "should get percentage of words with three syllab=
les" do<br>
> Readability.new('there is just one word with t=
hree syllables in<br>
> this sentence').percentage_words_with_three_syllables.round.should=
=3D=3D<br>
> 9<br>
> Readability.new('there are no valid words with three Syllables<br>
> in this sentence').percentage_words_with_three_syllables.round.<br>
> should =3D=3D 9<br>
> Readability.new('there is one and only one wor=
d with three or<br>
> more syllables in this long boring sentence of twenty<br>
> words').percentage_words_with_three_syllables.round.should =3D=3D =
5<br>
> Readability.new('there are two and only two wo=
rds with three or<br>
> more syllables in this long sentence of exactly twenty<br>
> words').percentage_words_with_three_syllables.round.should =3D=3D =
10<br>
> Readability.new('there is Actually only one va=
lid word with<br>
> three or more syllables in this long sentence of Exactly twenty<br>
> words').percentage_words_with_three_syllables(false).round.should =
=3D=3D 5<br>
> Readability.new('no long words in this<br>
> sentence').percentage_words_with_three_syllables.round.should =3D=
=3D 0<br>
> Readability.new('no long valid words in this s=
entence because<br>
> the test ignores proper case words like this<br>
> Behemoth').percentage_words_with_three_syllables(false).round.<br>
> should =3D=3D 0<br>
> end<br>
><br>
> it "should count letters" do<br>
> Readability.new('a').letter_count.should =
=3D=3D 1<br>
> Readability.new('').letter_count.should =
=3D=3D 0<br>
> Readability.new('this sentence has 30 characte=
rs, not including<br>
> the digits').letter_count.should =3D=3D 46<br>
> end<br>
><br>
> it "should count sentences" do<br>
> Readability.new('This is a sentence').sent=
ence_count.should =3D=3D 1<br>
> Readability.new('This is a sentence.').sen=
tence_count.should =3D=3D<br>
> 1<br>
> Readability.new('This is a sentence!').sen=
tence_count.should =3D=3D<br>
> 1<br>
> Readability.new('This is a sentence?').sen=
tence_count.should =3D=3D<br>
> 1<br>
> Readability.new('This is a sentence..').se=
ntence_count.should =3D=3D<br>
> 1<br>
> Readability.new('This is a sentence. So is<br>
> this.').sentence_count.should =3D=3D 2<br>
> Readability.new("This is a sentence. \n\n So =
is this, but this<br>
> is multi-line!").sentence_count.should =3D=3D 2<br>
> Readability.new('This is a sentence,. So is<br=
>
> this.').sentence_count.should =3D=3D 2<br>
> Readability.new('This is a sentence!? So is<br=
>
> this.').sentence_count.should =3D=3D 2<br>
> Readability.new('This is a sentence. So is thi=
s. And this one as<br>
> well.').sentence_count.should =3D=3D 3<br>
> Readability.new('This is a sentence - but just=
<br>
> one.').sentence_count.should =3D=3D 1<br>
> Readability.new('This is a sentence (but just<=
br>
> one).').sentence_count.should =3D=3D 1<br>
> end<br>
><br>
> it "should calculate average words per sentence" do<br>
> Readability.new('This is a<br>
> sentence').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This is a<br>
> sentence.').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This is a sentence.<br>
> ').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This is a sentence. This is a=
<br>
> sentence').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This is a sentence. This is a=
<br>
> sentence.').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This, is - a sentence . This =
is a sentence.<br>
> ').average_words_per_sentence.should =3D=3D 4<br>
> Readability.new('This is a sentence with extra =
text. This is a<br>
> sentence. ').average_words_per_sentence.should =3D=3D 5.5<br>
> Readability.new('This is a sentence with some =
extra text. This<br>
> is a sentence. ').average_words_per_sentence.should =3D=3D 6<br>
> end<br>
><br>
> describe "test indices directly" do<br>
> before(:each) do<br>
> @str_a =3D 'This. Is. A. Nice. Set. Of.=
Small. Words. Of. One.<br>
> Part. Each.'<br>
> @str_b =3D 'The quick brown fox jumped =
over the lazy dog.'<br>
> @str_c =3D 'The quick brown fox jumped =
over the lazy dog. The<br>
> quick brown fox jumped over the lazy dog.'<br>
> @str_d =3D "The quick brown fox jumped =
over the lazy dog. \n\n<br>
> The quick brown fox jumped over the lazy dog."<br>
> @str_e =3D 'The quick brown fox jumped =
over the lazy dog. The<br>
> quick brown fox jumped over the lazy dog'<br>
> @str_f =3D 'Now it is time for a more c=
omplicated sentence,<br>
> including several longer words.'<br>
> @str_g =3D 'Now it is time for a more C=
omplicated sentence,<br>
> including Several longer words.'<br>
> end<br>
><br>
> it "should calculate flesch-kincaid reading e=
ase" do<br>
> Readability.new(@str_a).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 121.2<br>
> Readability.new(@str_b).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 94.3<br>
> Readability.new(@str_c).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 94.3<br>
> Readability.new(@str_d).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 94.3<br>
> Readability.new(@str_e).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 94.3<br>
> Readability.new(@str_f).flesch_kincaid_read=
ing_ease.should =3D=3D<br>
> 50.5<br>
> end<br>
><br>
> it "should calculate flesch-kincaid grade lev=
el" do<br>
> Readability.new(@str_a).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> -3.4<br>
> Readability.new(@str_b).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> 2.3<br>
> Readability.new(@str_c).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> 2.3<br>
> Readability.new(@str_d).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> 2.3<br>
> Readability.new(@str_e).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> 2.3<br>
> Readability.new(@str_f).flesch_kincaid_grad=
e_level.should =3D=3D<br>
> 9.4<br>
> end<br>
><br>
> it "should calculate Gunning-Fog Score" =
do<br>
> Readability.new(@str_a).gunning_fog_score.s=
hould =3D=3D 0.4<br>
> Readability.new(@str_b).gunning_fog_score.s=
hould =3D=3D 3.6<br>
> Readability.new(@str_c).gunning_fog_score.s=
hould =3D=3D 3.6<br>
> Readability.new(@str_d).gunning_fog_score.s=
hould =3D=3D 3.6<br>
> Readability.new(@str_e).gunning_fog_score.s=
hould =3D=3D 3.6<br>
> Readability.new(@str_f).gunning_fog_score.s=
hould =3D=3D 14.4<br>
> Readability.new(@str_g).gunning_fog_score.s=
hould =3D=3D 8.3<br>
> end<br>
><br>
> it "should calculate coleman-liau index" =
do<br>
> Readability.new(@str_a).coleman_liau_index.=
should =3D=3D 3.0<br>
> Readability.new(@str_b).coleman_liau_index.=
should =3D=3D 7.7<br>
> Readability.new(@str_c).coleman_liau_index.=
should =3D=3D 7.7<br>
> Readability.new(@str_d).coleman_liau_index.=
should =3D=3D 7.7<br>
> Readability.new(@str_e).coleman_liau_index.=
should =3D=3D 7.7<br>
> Readability.new(@str_f).coleman_liau_index.=
should =3D=3D<br>
> 13.6<br>
> end<br>
><br>
> it "should calculate smog index" do<br>
> Readability.new(@str_a).smog_index.should =
=3D=3D 1.8<br>
> Readability.new(@str_b).smog_index.should =
=3D=3D 1.8<br>
> Readability.new(@str_c).smog_index.should =
=3D=3D 1.8<br>
> Readability.new(@str_d).smog_index.should =
=3D=3D 1.8<br>
> Readability.new(@str_e).smog_index.should =
=3D=3D 1.8<br>
> Readability.new(@str_f).smog_index.should =
=3D=3D<br>
> 10.1<br>
> end<br>
><br>
> it "should calculate automated readability in=
dex" do<br>
> Readability.new(@str_a).automated_readabili=
ty_index.should =3D=3D<br>
> -5.6<br>
> Readability.new(@str_b).automated_readabili=
ty_index.should =3D=3D<br>
> 1.9<br>
> Readability.new(@str_c).automated_readabili=
ty_index.should =3D=3D<br>
> 1.9<br>
> Readability.new(@str_d).automated_readabili=
ty_index.should =3D=3D<br>
> 1.9<br>
> Readability.new(@str_e).automated_readabili=
ty_index.should =3D=3D<br>
> 1.9<br>
> Readability.new(@str_f).automated_readabili=
ty_index.should =3D=3D<br>
> 8.6<br>
> end<br>
><br>
> it "should index first paragraph of Moby Dick =
correctly" do<br>
> str =3D<<-ENDL<br>
> Call me Ishmael. Some years ago - ne=
ver mind how long<br>
> precisely - having little or no money in my purse, and<br>
> nothing particular to interest me on =
shore, I thought I<br>
> would sail about a little and see the watery part of<br>
> the world. It is a way I have of dri=
ving off the spleen, and<br>
> regulating the circulation. Whenever I find myself<br>
> growing grim about the mouth; whenev=
er it is a damp, drizzly<br>
> November in my soul; whenever I find myself<br>
> involuntarily pausing before coffin =
warehouses, and bringing<br>
> up the rear of every funeral I meet; and especially<br>
> whenever my hypos get such an upper =
hand of me, that it<br>
> requires a strong moral principle to prevent me from<br>
> deliberately stepping into the stree=
t, and methodically<br>
> knocking people's hats off - then, I account it high time<br>
> to get to sea as soon as I can. This =
is my substitute for<br>
> pistol and ball. With a philosophical flourish Cato<br>
> throws himself upon his sword; I qui=
etly take to the ship.<br>
> There is nothing surprising in this. If they but knew<br>
> it, almost all men in their degree, =
some time or other,<br>
> cherish very nearly the same feelings towards the ocean with me.<br>
> ENDL<br>
><br>
> readability =3D Readability.new(str)<br>
><br>
> readability.letter_count.should =3D=3D 884<=
br>
> readability.word_count.should =3D=3D 201<br=
>
> readability.total_syllables.should =3D=3D 3=
04<br>
> readability.sentence_count.should =3D=3D 8<=
br>
> readability.words_with_three_syllables.shou=
ld =3D=3D 23<br>
><br>
> readability.flesch_kincaid_grade_level.shou=
ld =3D=3D 12.1<br>
> readability.flesch_kincaid_reading_ease.sho=
uld =3D=3D 53.4<br>
> readability.gunning_fog_score.should =3D=3D=
14.2<br>
> readability.coleman_liau_index.should =3D=
=3D 10.1<br>
> readability.smog_index.should =3D=3D 8.9<br=
>
> readability.automated_readability_index.sho=
uld =3D=3D 11.8<br>
> end<br>
><br>
> it "should index a Kipling poem correctly&quo=
t; do<br>
> str =3D<<-ENDL<br>
> If you can keep your head when all a=
bout you<br>
> Are losing theirs and blaming it on =
you,<br>
> If you can trust yourself when all m=
en doubt you<br>
> But make allowance for their doubtin=
g too,<br>
> If you can wait and not be tired by =
waiting,<br>
> Or being lied about, don't deal =
in lies,<br>
> Or being hated, don't give way t=
o hating,<br>
> And yet don't look too good, nor =
talk too wise:<br>
><br>
> If you can dream - and not make drea=
ms your master,<br>
> If you can think - and not make thou=
ghts your aim;<br>
> If you can meet with Triumph and Dis=
aster<br>
> And treat those two impostors just t=
he same;<br>
> If you can bear to hear the truth yo=
u've spoken<br>
> Twisted by knaves to make a trap for =
fools,<br>
> Or watch the things you gave your li=
fe to, broken,<br>
> And stoop and build 'em up with =
worn-out tools:<br>
><br>
> If you can make one heap of all your =
winnings<br>
> And risk it all on one turn of pitch=
-and-toss,<br>
> And lose, and start again at your be=
ginnings<br>
> And never breath a word about your l=
oss;<br>
> If you can force your heart and nerv=
e and sinew<br>
> To serve your turn long after they a=
re gone,<br>
> And so hold on when there is nothing =
in you<br>
> Except the Will which says to them: =
"Hold on"<br>
><br>
> If you can talk with crowds and keep =
your virtue,<br>
> Or walk with kings - nor lose the co=
mmon touch,<br>
> If neither foes nor loving friends c=
an hurt you;<br>
> If all men count with you, but none =
too much,<br>
> If you can fill the unforgiving minu=
te<br>
> With sixty seconds' worth of dis=
tance run,<br>
> Yours is the Earth and everything th=
at's in it,<br>
> And - which is more - you'll be =
a Man, my son<br>
> ENDL<br>
><br>
> readability =3D Readability.new(str)<br>
><br>
> readability.letter_count.should =3D=3D 1125=
<br>
> readability.word_count.should =3D=3D 292<br=
>
> readability.total_syllables.should =3D=3D 3=
38<br>
> readability.sentence_count.should =3D=3D 1<=
br>
> readability.words_with_three_syllables.shou=
ld =3D=3D 6<br>
><br>
> readability.flesch_kincaid_grade_level.shou=
ld =3D=3D 111.9<br>
> readability.flesch_kincaid_reading_ease.sho=
uld =3D=3D -187.5<br>
> readability.gunning_fog_score.should =3D=3D=
117.5<br>
> readability.coleman_liau_index.should =3D=
=3D 6.9<br>
> readability.smog_index.should =3D=3D 14.1<b=
r>
> readability.automated_readability_index.sho=
uld =3D=3D 142.7<br>
> end<br>
> end<br>
> end<br>
><br>
> ><br>
><br>
<br>
<br>
</div></div></blockquote></div><br>
------=_Part_245913_5335230.1231353997326--