Explain the basics of Ruby's regexp and leave a pointer to ruby-doc.org.

Akinori MUSHA, 10 years ago
parent
commit
39046da035
1 file changed, with 3 additions and 1 deletion
app/models/agents/website_agent.rb (+3 -1)

@@ -57,9 +57,11 @@ module Agents
       To extract the whole content as one event:
 
           "extract": {
-            "content": { "regexp": "\A(?:.|\n)*\z", index: 0 },
+            "content": { "regexp": "\A(?m:.)*\z", index: 0 },
           }
 
+      Beware that `.` does not match the newline character (LF) unless the `m` flag is in effect, and `^`/`$` basically match every line beginning/end.  See [this document](http://ruby-doc.org/core-#{RUBY_VERSION}/doc/regexp_rdoc.html) to learn the regular expression variant used in this service.
+
       Note that for all of the formats, whatever you extract MUST have the same number of matches for each extractor.  E.g., if you're extracting rows, all extractors must match all rows.  For generating CSS selectors, something like [SelectorGadget](http://selectorgadget.com) may be helpful.
 
       Can be configured to use HTTP basic auth by including the `basic_auth` parameter with `"username:password"`, or `["username", "password"]`.
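
The regexp behaviour described in the added note can be checked with a short Ruby sketch. The sample string and the variable names below are hypothetical and are not part of the commit; only the two patterns are taken from the diff.

```ruby
# Hypothetical two-line sample standing in for fetched page content.
text = "first line\nsecond line"

# Without the m flag, `.` stops at the newline, so only the first line matches.
no_m = text[/.*/]

# Pattern removed by the commit: an explicit alternation that also accepts "\n".
old_whole = text[/\A(?:.|\n)*\z/]

# Pattern added by the commit: the m flag is switched on just for the `.`.
new_whole = text[/\A(?m:.)*\z/]

# `^` and `$` match at line boundaries, so scan yields one match per line.
lines = text.scan(/^.*$/)

# `\A` and `\z` anchor to the whole string; without m this cannot match.
anchored = text[/\A.*\z/]

puts no_m
puts new_whole == text
puts lines.inspect
puts anchored.inspect
```

Both the old and the new pattern capture the entire string here; the new one simply expresses the same intent more compactly by scoping the `m` flag to a single group.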