Browse is a language used to build powerful libraries while keeping the end-user experience as simple as possible (think bash). Browse achieves this by treating thunks as first-class citizens (think of a thunk as an intent to call a function). This lets library creators implement complex behavior with minimal changes to the end-user experience. To show off Browse, we built a library called “web”, which aims to make web scraping, browser automation, and UI testing simple.
Rules and RuleSets
To implement first-class thunks, we use Rules and RuleSets. A Rule is an intent to execute some action, and a RuleSet is a collection of Rules.
Example
# The print rule being used
print "Hello World" # prints "Hello World" when evaluated
# A RuleSet
{
  print "Hello"
  print "World"
}
Every line of code in Browse begins with a rule name, so the bare RuleSet above can't be executed on its own. However, we can pass a RuleSet into a rule.
# Evaluate the rules in the RuleSet sequentially
eval {
  print "Hello"
  print "World"
}
# Evaluate the rules in the RuleSet sequentially, but in reverse
eval(reverse) {
  print "Hello"
  print "World"
}
RuleSets are what Browse uses to represent thunks.
At the top level of a Browse program, every Rule is evaluated sequentially, but higher-order rules (Rules that take RuleSets as arguments) can change that behavior.
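To make the thunk idea concrete outside of Browse, here is a minimal Python sketch. It is not how Browse is implemented: `Rule`, `RuleSet`, `print_rule`, and `eval_ruleset` are hypothetical names, and a RuleSet is modelled simply as a list of zero-argument callables that run only when a higher-order rule decides to run them.

```python
from typing import Any, Callable, List

# Assumption: a Rule is modelled as a zero-argument callable (a thunk),
# and a RuleSet as a list of such thunks. These names are illustrative only.
Rule = Callable[[], Any]
RuleSet = List[Rule]

output: List[str] = []  # collected instead of printed so the order is easy to inspect


def print_rule(text: str) -> Rule:
    """Build a Rule that records `text` when evaluated and returns it."""
    def run() -> str:
        output.append(text)
        return text
    return run


def eval_ruleset(rules: RuleSet, reverse: bool = False) -> Any:
    """Evaluate every Rule in order (or in reverse) and return the last result."""
    ordered = list(reversed(rules)) if reverse else rules
    result = None
    for rule in ordered:
        result = rule()
    return result


hello_world: RuleSet = [print_rule("Hello"), print_rule("World")]
# Building the RuleSet runs nothing; `output` is still empty at this point.

eval_ruleset(hello_world)                # records "Hello", then "World"
eval_ruleset(hello_world, reverse=True)  # records "World", then "Hello"
```

The point of the sketch is that the same RuleSet value can be handed to different higher-order rules, each of which chooses its own evaluation order.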
Some Design Decisions
Apart from the obvious decision to make RuleSets first-class citizens, there are a few other key decisions worth noting (roughly ranked in order of importance).
Implementing most language features as Rules
while, if, and for are all rules
  Typically these are built into the language
  We proved that higher-order Rules can look like they're part of the language
  while is implemented completely in Browse
To consume arguments to a function, we use a rule called bind
  Typically, arguments to a function are consumed automatically
  Argument handling is normally managed by the language (i.e. pass-by-{name|value|reference})
  The bind rule lets the rule's author define when and how arguments should be consumed
  This even allows the author to customize the meaning of thunk composition
  Normally, f.g === f(g(...)). But in Browse, f gets access to the thunk for g instead of just receiving a value
Arrays and Dictionaries are implemented as RuleSets, using special rules to set elements
  Typically these are built into the language
  Special "subscript" syntax to access array and dictionary values can be implemented as syntactic sugar
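The difference between receiving a value and receiving a thunk, described for bind above, can be sketched in Python. This is only an illustration under assumed names (`g`, `eager_f`, `lazy_f`, and the `times` parameter are hypothetical); it does not show Browse's actual bind semantics.

```python
from typing import Callable, List

calls: List[str] = []  # records every time g actually runs


def g() -> int:
    calls.append("g")
    return 21


def eager_f(value: int) -> int:
    """Conventional composition: g has already run by the time f starts."""
    return value * 2


def lazy_f(thunk: Callable[[], int], times: int = 1) -> int:
    """Thunk-aware composition: f decides when and how often g runs."""
    total = 0
    for _ in range(times):
        total += thunk()
    return total * 2


eager_result = eager_f(g())    # g runs exactly once, before eager_f is entered
skipped = lazy_f(g, times=0)   # g is never run
repeated = lazy_f(g, times=3)  # g is run three times
```

Because `lazy_f` receives the thunk rather than its result, it can skip, repeat, or delay the call, which is the kind of control a rule author gets over argument consumption.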
Expressions only (Every Rule returns a value)
Implicit return
The eval rule evaluates the RuleSet passed as an argument.
Since every rule must return a value, eval returns the value returned by the last rule in the RuleSet.
This means that return is implicit. However, we do expose a rule called return that is an alias for id (the identity rule). This works nearly as well and helps when reading the code.
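A small Python sketch of the implicit-return idea follows. The names `identity`, `return_rule`, and `eval_ruleset` are hypothetical stand-ins for id, return, and eval. The last assertion-worthy case (an identity-based return does not stop evaluation early) is a property of this sketch, not a documented statement about Browse.

```python
from typing import Any, Callable, List


def identity(value: Any) -> Any:
    """Stand-in for the id rule: hand back whatever was passed in."""
    return value


# Stand-in for the return rule: just another name for identity.
return_rule = identity


def eval_ruleset(rules: List[Callable[[], Any]]) -> Any:
    """Evaluate each rule in order and return the value of the last one."""
    result = None
    for rule in rules:
        result = rule()
    return result


# The value of the last rule is the value of the whole RuleSet.
implicit = eval_ruleset([lambda: "ignored", lambda: 42])

# Wrapping the last value in return_rule changes nothing except readability.
explicit = eval_ruleset([lambda: "ignored", lambda: return_rule(42)])

# In this sketch, return_rule does not exit early: later rules still run.
not_early = eval_ruleset([lambda: return_rule(1), lambda: 2])
```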
Named Arguments/Options
We love how most bash programs take optional arguments as flags (-h, --force, etc.).
However, we didn't like how flags can be interspersed with positional arguments, which leads to ambiguity.
So, we added a native syntax for passing options to rules. It looks like this:
my_rule(double !yell prefix="Ans: ") 2
# This sets `double` to true, `yell` to false, and `prefix` to "Ans: "
# Look at the wiki to see how rules can `bind` options and use them
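To illustrate how the option syntax above maps to values, here is a Python sketch. It is not Browse's parser: `parse_options` and the Python version of `my_rule` (including what `double`, `yell`, and `prefix` do to the argument) are assumptions made up for this example.

```python
import shlex
from typing import Dict, Union

OptionValue = Union[bool, str]


def parse_options(source: str) -> Dict[str, OptionValue]:
    """Turn 'double !yell prefix="Ans: "' into a dictionary of options."""
    options: Dict[str, OptionValue] = {}
    # shlex.split keeps the quoted value together and strips the quotes.
    for token in shlex.split(source):
        if "=" in token:
            name, value = token.split("=", 1)
            options[name] = value          # name=value sets a string option
        elif token.startswith("!"):
            options[token[1:]] = False     # !name sets the option to false
        else:
            options[token] = True          # a bare name sets the option to true
    return options


def my_rule(value: int, double: bool = False, yell: bool = False, prefix: str = "") -> str:
    """Hypothetical rule body: optionally double the value, prefix it, and shout it."""
    result = value * 2 if double else value
    text = f"{prefix}{result}"
    return text.upper() + "!" if yell else text


options = parse_options('double !yell prefix="Ans: "')
message = my_rule(2, **options)
```

Keeping the options inside the parentheses means the positional argument (`2`) can never be confused with a flag, which is the ambiguity the syntax is meant to avoid.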
Unquoted Strings
Inspired by bash, Browse supports unquoted strings
print hello world
Optional Semicolons
As in bash, omitting semicolons generally results in cleaner code.
That being said, it’s not too hard to dream up cases where semicolons could come in handy, so they are optional.
Examples
If you’re confused about what some of these functions do, check this out for the standard browse rules, and this for the web rules.
# A Fibonacci rule
rule fib {
  # Bind "n" to the first argument
  bind n
  if ($n <= 1) then {
    # Base case
    return 1
  } else {
    # Recursion
    return (fib $n - 1) + (fib $n - 2)
  }
}
# Run the program
print (fib 5)
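For readers who want to check the expected result, the same recursion can be written in Python. This assumes that `fib $n - 1` in the Browse version applies fib to n - 1, which the example does not state explicitly.

```python
def fib(n: int) -> int:
    """Mirror of the Browse rule: the base case returns 1 for n <= 1."""
    if n <= 1:
        return 1
    return fib(n - 1) + fib(n - 2)


result = fib(5)  # 1, 1, 2, 3, 5, 8 for n = 0..5, so this is 8
```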
# Pass `--web` to browse when running this example
# $ browse --web ./examples/web/wikipedia.browse
page https://en.wikipedia.org/wiki/:slug {
  # Grab a string from the webpage and store it in a variable called 'title'
  # Note '@' is not a special symbol. The name of the rule happens to be '@string', that's all
  @string title `#firstHeading`
  # Grab an array of strings from the webpage and store them in paragraphs
  @arr(string) paragraphs `div.mw-parser-output`
  out title paragraphs
  # uncomment this to infinitely crawl through wikipedia
  # crawl `a`
}
# Start the crawl
visit https://en.wikipedia.org/wiki/Kevin_Bacon
page https://www.thegazette.co.uk/notice/:issue {
  print $url
  config { set output "./notices/" + $issue + ".json" }
  wait `.wrapperContent`
  @string title `h1.title`
  @string? date `dd time`
  @string? notice `div[about="this:notifiableThing"]`
  out title date notice
}
visit https://www.thegazette.co.uk/all-notices/notice?text=&categorycode-all=all&noticetypes=&location-postcode-1=&location-distance-1=1&location-local-authority-1=&numberOfLocationSearches=1&start-publish-date=01%2F01%2F2000&end-publish-date=12%2F08%2F2020&edition=&london-issue=&edinburgh-issue=&belfast-issue=&sort-by=&results-page-size=10
# Also fetch these
for { set i 2; test $i < 5; set i $i + 1 } {
  visit https://www.thegazette.co.uk/London/issue/ + $i + "/page/2"
}
set headless false
page https://www.twitch.tv {
  # Click link for full code
  set logIn ...
  set username ...
  set birthMonth ...
  set birthDay ...
  set birthYear ...
  type 'RandomTwitchUser31415'
  wait $logIn
  click $logIn
  wait $username
  type 'RandomTwitchUser31415'
  click '#password-input'
  type 'jfnosenfjksef'
  click '#password-input-confirmation'
  type 'jfnosenfjksef'
  click $birthMonth
  type 'apr'
  press Enter
  click $birthDay
  type '29'
  click $birthYear
  type '1997'
  click '#email-input'
  type '[email protected]'
  sleep 1000 * 10 # 10s
}
visit https://www.twitch.tv/
Browse - Build expressive libraries with thunks
Complete Documentation
👉 Check this short Wiki. We put a ton of work into it 👈
Examples
Fibonacci
Higher order rules
Conway's Game of Life
Web Examples
Here are some examples of web scraping scripts written in browse:
Wikipedia Scraper
The Gazette Scraper
Twitch Sign Up
More Examples
Here
Technical Trade offs
Prototyping the language in javascript
Implementing language features as rules
The Roadmap
Result type, ? and ! to control error handling. Stay tuned
bind and non-linear control flow
Language Support
browse format
Team @windsorio
@atfaust2 💻 📖 🖋 🤔
@pranaygp 💻 📖 🎨 🤔
Interesting! This reminds me of a similar language (with similar features) called Tcl
@theangryepicbanana That's really cool. Gonna have to steal a bunch of ideas ;)