class EmailReplyParser::Email
An Email instance represents a parsed body String.
Constants
- EMPTY
- SIGNATURE
- SIG_REGEX
Attributes
Emails have an Array of Fragments.
Public Class Methods
# File lib/email_reply_parser.rb, line 60
def initialize
  @fragments = []
end
Public Instance Methods
Splits the given text into a list of Fragments. This is roughly done by reversing the text and parsing from the bottom to the top. This way we can check for 'On <date>, <author> wrote:' lines above quoted blocks.
text - A String email body.
Returns this same Email instance.
# File lib/email_reply_parser.rb, line 78
def read(text)
  # in 1.9 we want to operate on the raw bytes
  text = text.dup.force_encoding('binary') if text.respond_to?(:force_encoding)

  # Normalize line endings.
  text.gsub!("\r\n", "\n")

  # Check for multi-line reply headers. Some clients break up
  # the "On DATE, NAME <EMAIL> wrote:" line into multiple lines.
  if text =~ /^(?!On.*On\s.+?wrote:)(On\s(.+?)wrote:)$/nm
    # Remove all new lines from the reply header.
    text.gsub! $1, $1.gsub("\n", " ")
  end

  # Some users may reply directly above a line of underscores.
  # In order to ensure that these fragments are split correctly,
  # make sure that all lines of underscores are preceded by
  # at least two newline characters.
  text.gsub!(/([^\n])(?=\n_{7}_+)$/m, "\\1\n")

  # The text is reversed initially due to the way we check for hidden
  # fragments.
  text = text.reverse

  # This determines if any 'visible' Fragment has been found. Once any
  # visible Fragment is found, stop looking for hidden ones.
  @found_visible = false

  # This instance variable points to the current Fragment. If the matched
  # line fits, it should be added to this Fragment. Otherwise, finish it
  # and start a new Fragment.
  @fragment = nil

  # Use the StringScanner to pull out each line of the email content.
  @scanner = StringScanner.new(text)
  while line = @scanner.scan_until(/\n/n)
    scan_line(line)
  end

  # Be sure to parse the last line of the email.
  if (last_line = @scanner.rest.to_s).size > 0
    scan_line(last_line)
  end

  # Finish up the final fragment. Finishing a fragment will detect any
  # attributes (hidden, signature, reply), and join each line into a
  # string.
  finish_fragment

  @scanner = @fragment = nil

  # Now that parsing is done, reverse the order.
  @fragments.reverse!
  self
end
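The multi-line reply header step in read can be tried in isolation. The following is a minimal sketch, not the library's API: the helper name join_reply_header and the sample body are hypothetical, and the /n flag is dropped because the sample is plain ASCII.

```ruby
# Same pattern as in `read`, without the /n flag (the sample is plain ASCII).
REPLY_HEADER = /^(?!On.*On\s.+?wrote:)(On\s(.+?)wrote:)$/m

# Hypothetical helper: collapses a reply header that a mail client wrapped
# across several lines into a single line.
def join_reply_header(text)
  match = REPLY_HEADER.match(text)
  return text unless match

  # Block form avoids any backslash interpretation in the replacement.
  text.sub(match[1]) { match[1].tr("\n", " ") }
end

# Hypothetical sample body with the header split over two lines.
body = "Sounds good.\n\n" \
       "On Tue, Jan 3, 2012 at 9:00 AM, Jane Doe\n" \
       "<jane@example.com> wrote:\n" \
       "> Lunch at noon?\n"

joined = join_reply_header(body)
puts joined
```

Once the header sits on one line, the reversed-line check in quote_header? can recognize it as part of the quoted Fragment.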
Public: Gets the combined text of the visible fragments of the email body.
Returns a String.
# File lib/email_reply_parser.rb, line 67
def visible_text
  fragments.select{|f| !f.hidden?}.map{|f| f.to_s}.join("\n").rstrip
end
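To illustrate what visible_text combines, here is a small sketch that uses a stand-in Struct instead of the real Fragment class. The name Frag and the sample fragments are assumptions made for the example only.

```ruby
# Stand-in for Fragment (hypothetical): only the two methods that
# visible_text relies on are provided.
Frag = Struct.new(:text, :hidden) do
  def hidden?
    hidden
  end

  def to_s
    text
  end
end

fragments = [
  Frag.new("Go fish!\n", false),
  Frag.new("> do you have any two's?\n", true),
  Frag.new("-- \nPlayer 2", true)
]

# Same chain as visible_text: drop hidden fragments, join, trim the tail.
visible = fragments.reject(&:hidden?).map(&:to_s).join("\n").rstrip
puts visible
```

Only the fragment that is not hidden contributes to the result, and rstrip removes its trailing newline.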
Private Instance Methods
Builds the fragment string and reverses it, after all lines have been added. It also checks to see if this Fragment is hidden. The hidden Fragment check reads from the bottom to the top.
Quoted Fragments and signature Fragments are marked hidden only when they lie below every visible Fragment. Visible Fragments are expected to contain original content by the author. A quoted Fragment that has a visible Fragment below it stays visible, since it gives context to the reply.
  some original text (visible)

  > do you have any two's? (quoted, visible)

  Go fish! (visible)

  > --
  > Player 1 (quoted, hidden)

  --
  Player 2 (signature, hidden)
# File lib/email_reply_parser.rb, line 217
def finish_fragment
  if @fragment
    @fragment.finish
    if !@found_visible
      if @fragment.quoted? || @fragment.signature? ||
          @fragment.to_s.strip == EMPTY
        @fragment.hidden = true
      else
        @found_visible = true
      end
    end
    @fragments << @fragment
  end
  @fragment = nil
end
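The bottom-to-top hidden check can be sketched on its own, using the card game example above. This is an illustration under assumptions: Piece and mark_hidden are hypothetical names, and the real parser works on Fragment objects one at a time as it scans.

```ruby
# Hypothetical stand-in for a finished Fragment.
Piece = Struct.new(:text, :quoted, :signature, :hidden)

# Walks the pieces from the bottom of the email upward. Quoted, signature,
# and blank pieces are hidden until the first visible piece is found.
def mark_hidden(pieces_bottom_up)
  found_visible = false
  pieces_bottom_up.each do |piece|
    next if found_visible

    if piece.quoted || piece.signature || piece.text.strip.empty?
      piece.hidden = true
    else
      found_visible = true
    end
  end
  # Restore top-to-bottom order, as read does with @fragments.reverse!
  pieces_bottom_up.reverse
end

pieces = [
  Piece.new("-- \nPlayer 2", false, true, false),
  Piece.new("> -- \n> Player 1", true, false, false),
  Piece.new("Go fish!", false, false, false),
  Piece.new("> do you have any two's?", true, false, false),
  Piece.new("some original text", false, false, false)
]

ordered = mark_hidden(pieces)
hidden_flags = ordered.map(&:hidden)
p hidden_flags # => [false, false, false, true, true]
```

The quoted question stays visible because a visible piece ("Go fish!") was already found below it.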
Detects if a given line is a header above a quoted area. It is only checked for lines preceding quoted regions.
line - A String line of text from the email.
Returns a truthy value (the match offset) if the line is a valid header, or nil otherwise.
# File lib/email_reply_parser.rb, line 191
def quote_header?(line)
  line =~ /^:etorw.*nO$/n
end
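The pattern in quote_header? looks odd because it runs against reversed text: ":etorw" is "wrote:" backwards and "nO" is "On". A brief sketch, with a hypothetical helper name and sample header, and without the /n flag since the sample is ASCII:

```ruby
# Hypothetical wrapper that returns a strict boolean.
def reversed_quote_header?(reversed_line)
  !!(reversed_line =~ /^:etorw.*nO$/)
end

header = "On Tue, Jan 3, 2012, Jane Doe wrote:"

matches_reversed = reversed_quote_header?(header.reverse)
matches_forward  = reversed_quote_header?(header)

p matches_reversed # => true
p matches_forward  # => false
```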
Scans the given line of text and figures out which fragment it belongs to.
line - A String line of text from the email.
Returns nothing.
# File lib/email_reply_parser.rb, line 153
def scan_line(line)
  line.chomp!("\n")
  line.lstrip! unless SIG_REGEX.match(line)

  # We're looking for leading `>`'s to see if this line is part of a
  # quoted Fragment.
  is_quoted = !!(line =~ /(>+)$/n)

  # Mark the current Fragment as a signature if the current line is empty
  # and the Fragment starts with a common signature indicator.
  if @fragment && line == EMPTY
    if SIG_REGEX.match @fragment.lines.last
      @fragment.signature = true
      finish_fragment
    end
  end

  # If the line matches the current fragment, add it. Note that a common
  # reply header also counts as part of the quoted Fragment, even though
  # it doesn't start with `>`.
  if @fragment &&
      ((@fragment.quoted? == is_quoted) ||
       (@fragment.quoted? && (quote_header?(line) || line == EMPTY)))
    @fragment.lines << line

  # Otherwise, finish the fragment and start a new one.
  else
    finish_fragment
    @fragment = Fragment.new(is_quoted, line)
  end
end
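Because scan_line receives reversed lines, a leading `>` in the original email appears at the end of the line, which is why the quote check is anchored with `$`. A minimal sketch of that one check, using a hypothetical helper that reverses the line itself:

```ruby
# Hypothetical helper: reverses a line as read does, strips what was the
# trailing whitespace, and applies the same quote test as scan_line.
def quoted_when_reversed?(line)
  reversed = line.reverse.lstrip
  !!(reversed =~ /(>+)$/)
end

quoted   = quoted_when_reversed?("> Lunch at noon?")
unquoted = quoted_when_reversed?("Go fish!")
inline   = quoted_when_reversed?("a > b")

p [quoted, unquoted, inline] # => [true, false, false]
```

A `>` in the middle of a line does not count, since only the end of the reversed line (the start of the original) is tested.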