Below are the materials from my Installfest talk today: https://pretalx.installfest.cz/installfest-2022/talk/Z8TP3D/
1. Characteristics
- New language – announced 2012
- Goal: Solve the “two language” problem
  - Prototype in Python, rewrite in C++ for speed
- Compiled language (LLVM)
- Good for interactive work (JIT)
- Dynamic types + type inference
- Garbage collected
- Multiple dispatch paradigm
  - One function can have different implementations based on the types of all arguments
- Lisp-like macros
- Target domain:
  - Originally: Scientific computations
  - Nowadays: General purpose, Data Science, Visualisation, …
1.1. Speed
Source: https://youtu.be/LT4AP7CUMAw
1.2. My success story
- Load & process 2 GB CSV file
  - Julia: 40 s, 5 GB of memory
  - Python Pandas: 5 min, OUT OF MEMORY
  - Python Pandas + big machine: 15 min, 36 GB of memory
2. Installation
- distribution package managers
- download archive, unzip/untar, run
- console application
2.1. IDE: Julia plugin for VS Code
3. Calculator
1 + 1
3 * 5
4. REPL modes
Special characters at the start of the line switch modes:
Help
?atan
Shell
;ls
Package manager
]add IJulia
Return to the Julia prompt with Backspace.
5. Variables
x = 1
x
# Julia likes Unicode :-)
π # Enter as \pi<TAB>
α = atan(1/2)
3α + 100 # You can skip multiply operator
6. Vectors, Arrays
a=[10,20,30]
a[1] # one-based indexing
a[2:3]
a[2:end]
b=[1 2; 3 4]
c=[5 6 7 8]
6.1. Element-wise operations – broadcasting
a=rand(10_000_000)
b=rand(10_000_000)
a*b # ❌
a.*b
sin(a) # ❌
sin.(a) # Broadcasting
[sin(i) for i in a] # Comprehension
# Broadcasting is faster – no need to
# allocate intermediate memory
@time sin.(a) .+ cos.(b);
@time [sin(i) for i in a] .+ [cos(i) for i in b];
# In-place assignment (no memory allocation)
c = zeros(size(a))
@time c .= sin.(a).^2 .+ cos.(b).^2;
# Tired of writing dots?
@. c = sin(a)^2 + cos(b)^2
6.2. GPU arrays
- Available APIs
  - CUDA.jl
  - AMDGPU.jl
  - oneAPI.jl (Intel)
High-level (array) API:
using oneAPI
oneAPI.versioninfo()
a=rand(100_000_000)
b=rand(100_000_000)
c=zeros(size(a))
@time c .= a .* b;
ga = oneArray(a)
gb = oneArray(b)
gc = oneArray(c)
@time gc .= ga .* gb;
- Possibility to write GPU kernels in Julia
7. Types
Julia uses dynamic types + type inference.
typeof(1)
typeof(1.0)
typeof(Int8(1))
typeof("Hello world")
7.1. Parametric types
1//3
typeof(ans) # ans = result of previous line
typeof(Int8(1)//Int8(3))
typeof([1,2,3])
typeof([1 2 3])
typeof(Int8[1,2,3])
typeof([1 2;3 4.1])
7.2. Abstract types and type hierarchy
“is subtype” operator: <:
Integer <: Number
Int8 <: Integer
Int8 <: Number
Float32 <: Number
Float32 <: Integer
7.3. User-defined types
mutable struct MyType
    id::Int64
    text::String
end
x = MyType(10, "Hello")
# Define addition of our type
Base.:+(a::MyType, b::MyType) = MyType(a.id + b.id, "Sum")
x + MyType(1, "xxx")
8. Functions
# Mathematics-like definition
square(x) = x*x
square(4)
# Programming-like definition
function hello(name)
    return "Hello " * name
end
hello("Michal")
square("Hello")
# Operators are functions too
+(1,2)
→(a,b) = (a+b)/(a-b)
3→1
8.1. Anonymous functions
Functional programming features.
x->x^2
ans(5) # call
a=rand(10)
# Return elements > 0.5
filter(x->x>0.5, a)
[x for x in a if x > 0.5]
a[a .> 0.5]
8.2. Methods and multiple dispatch
Functions can have multiple methods (implementations). The method to call is selected based on the types of all arguments.
fun(x::Number) = "Number "*string(x)
methods(fun)
fun(1)
fun(1.1)
fun("1.1") # ❌ no method for String yet
fun(x::String) = "String "*x
methods(fun)
fun("1.1")
# Functions can have many methods
methods(+)
methods(show)
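The `fun` methods above differ in a single argument only. A minimal sketch of dispatch on two arguments follows; the types `Shape`, `Circle`, `Square` and the function `collide` are hypothetical names made up for this illustration.

```julia
# Hypothetical types, introduced only for this illustration
abstract type Shape end
struct Circle <: Shape end
struct Square <: Shape end

# One generic function with four methods. The method is chosen from
# the runtime types of *both* arguments, not just the first one.
collide(a::Circle, b::Circle) = "circle hits circle"
collide(a::Circle, b::Square) = "circle hits square"
collide(a::Square, b::Circle) = "square hits circle"
collide(a::Shape,  b::Shape)  = "generic shape collision"  # fallback

collide(Circle(), Square())   # "circle hits square"
collide(Square(), Square())   # no specific method, uses the Shape/Shape fallback
methods(collide)
```

Julia picks the most specific method that matches all argument types, so the fallback is used only when no more specific combination exists.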
8.3. Multimedia I/O (show examples)
Values can be presented in different formats:
@doc atan
typeof(@doc atan)
show(stdout, MIME("text/plain"), @doc atan)
show(stdout, MIME("text/html"), @doc atan)
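User-defined types can take part in the same mechanism by adding `show` methods for the MIME types they support. The sketch below assumes a hypothetical `Temperature` type; it is an illustration, not code from the talk.

```julia
# Hypothetical type used only for this illustration
struct Temperature
    celsius::Float64
end

# Plain-text representation (this is what the REPL displays)
Base.show(io::IO, ::MIME"text/plain", t::Temperature) =
    print(io, t.celsius, " °C")

# HTML representation (picked up by HTML-capable front ends such as notebooks)
Base.show(io::IO, ::MIME"text/html", t::Temperature) =
    print(io, "<b>", t.celsius, "</b> &deg;C")

t = Temperature(21.5)
show(stdout, MIME("text/plain"), t)
println()
show(stdout, MIME("text/html"), t)
println()

# repr renders a value for a given MIME type into a String
repr("text/html", t)   # "<b>21.5</b> &deg;C"
```

Which representation is used depends on the front end that displays the value, so the same value can look different in a terminal and in a notebook.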
9. Macros
- Inspired by Lisp
- Work at abstract syntax tree level.
- Can rewrite/generate programs.
- Often used for domain-specific languages
- No separate macro language, macros are written in Julia itself.
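To make the points above concrete, here is a small sketch of how code is represented as data and how a macro rewrites it. The macro name `@twice` and the variable `calls` are hypothetical, chosen only for this illustration.

```julia
# Code is ordinary Julia data: a quoted expression is an Expr object
ex = :(1 + 2 * 3)
typeof(ex)      # Expr
ex.head         # :call
ex.args         # Any[:+, 1, :(2 * 3)]
eval(ex)        # 7

# Hypothetical macro: receives the expression (not its value) and
# returns a new expression that evaluates it twice.
macro twice(ex)
    # esc() makes the expression refer to the caller's variables
    return quote
        $(esc(ex))
        $(esc(ex))
    end
end

calls = Int[]
@twice push!(calls, 1)
length(calls)   # 2

# Inspect the code the macro generates, without running it
@macroexpand @twice push!(calls, 1)
```

The macro body is plain Julia operating on `Expr` values, which is what "no separate macro language" means in practice.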
9.1. Examples
9.1.1. Unit testing
using Test
@test 1 + 1 == 2
9.1.2. Code simplification
sqrt(abs(sin(1)))
# Pipe syntax (for unary functions only)
1 |> sin |> abs |> sqrt
rnd = rand(10)
sort(rnd, rev=true) .+ 1
# Pipes with higher-arity functions ⇒ lambdas
rnd |> x -> sort(x, rev=true) |> x -> x .+ 1
using Pipe
# Piped value represented by underscore
@pipe rnd |> sort(_, rev=true) |> _ .+ 1
- Similar: Chain.jl, DataFramesMeta.jl, …
@chain df begin
    dropmissing
    filter(:id => >(6), _)
    groupby(:group)
    combine(:age => sum)
end
9.2. Benchmarking, code inspection, optimization
@time rand(Int(1e6));
using BenchmarkTools
@benchmark rand(Int(1e6))
@code_native sum(1:5)
a = [1, 2, 3]
# Don’t perform bounds checking
@inbounds a[2]
10. Showcase
Examples of what is possible with the language. None of this is built-in functionality; everything is programmed in Julia.
10.1. Measurements
Computation with uncertain values (error propagation)
using Measurements
a = 5.2 ± 1.0
typeof(a)
b = 3.7 ± 1.0
a + b
a * b
a / b
10.2. Unitful
using Unitful
using Unitful.DefaultSymbols
using Unitful: hr
1m + 3cm |> cm |> float
sin(90)
sin(90°)
sin(π/2)
15m/3s
10km/hr |> m/s
10km/hr |> m/s |> float
0°C |> K |> float
11. Plotting
- Plots supports multiple plotting backends (e.g. Python matplotlib).
- Infamous “Time to first plot” – much better today
using Plots
plot(sin.(0:0.1:2π))
- Gnuplot.jl: one of several available interfaces to Gnuplot.
- Faster than Plots
using Gnuplot
@gp sin.(0:0.1:2π) "with lines lw 5" "set grid"
12. Package management
- Fast development, breaking changes (packages with version < 1.0)
- Reproducible environments (projects)
  - Which packages and versions
  - Project.toml, Manifest.toml
  - Needs manual setup
12.1. Package manager
pwd()
# Switch to package manager
]
?
# create a new project or activate existing
activate .
add Pipe
;cat Project.toml
cat Manifest.toml
12.2. Using packages, modules
Module = namespace
module MyMod
export x
x=1
y=2
end
x
MyMod.x
- A package is a Git repository with a certain structure
# introduce exported symbols to current namespace
using Package
# introduce just the symbol Package to current namespace
import Package
13. Tasks & Channels
- Easy to use parallelism
- Similar to goroutines in Go
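A minimal sketch of the two building blocks, using only the standard library. The variable names are made up for this illustration; `@async` tasks run concurrently on one thread, and `Threads.@spawn` is the variant that can use multiple threads.

```julia
# Producer: the do-block runs as a task bound to the channel.
# The channel is closed automatically when the block returns.
squares = Channel{Int}(2) do ch      # buffer of 2 items
    for i in 1:5
        put!(ch, i^2)                # blocks while the buffer is full
    end
end

# Consumer: iterating a channel takes items until it is closed
results = Int[]
for v in squares
    push!(results, v)
end
results            # [1, 4, 9, 16, 25]

# Independent tasks: @sync waits for all enclosed @async tasks
elapsed = @elapsed begin
    @sync for i in 1:3
        @async sleep(0.5)            # a sleeping task yields to the others
    end
end
# The three sleeps overlap, so elapsed should be close to 0.5 s rather than 1.5 s
```

As with goroutines and Go channels, the producer and consumer synchronize only through the channel: `put!` and iteration block as needed.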
14. Interfacing other languages
Direct call to a function from shared library:
# libc call
ccall(:clock, Int32, ())
# using other libraries
ccall((:zlibVersion, "libz"), Cstring, ()) |> unsafe_string
Python code can be called transparently from Julia:
using PyCall
math = pyimport("math")
math.sin(math.pi / 4) # returns ≈ 1/√2 = 0.70710678...
15. Interactive notebooks
- Jupyter vs. Pluto.jl
- Pluto is something between Jupyter and Excel
15.1. Jupyter
using IJulia
notebook()
15.2. Pluto.jl
import Pluto
Pluto.run()
15.3. Feature comparison
Feature | Jupyter | Pluto.jl |
---|---|---|
Languages | many | Julia |
File format | JSON | Julia script with comments |
Results | Stored in JSON | Available only at runtime |
Execution order | Top-down/manual | Dependency-based |
Cell updates | Manual | Automatic |
Package management | No | Yes, reproducible |
16. DataFrames.jl
- Work with tabular data, named columns
- Easy import from CSV (CSV.jl)
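A short sketch of the workflow, assuming the CSV.jl and DataFrames.jl packages are installed. The file contents and column names are hypothetical, written to a temporary file so that the example is self-contained.

```julia
using CSV, DataFrames

# Hypothetical data, made up for this illustration
path = tempname() * ".csv"
write(path, """
id,group,age
1,a,31
2,b,45
3,a,27
""")

df = CSV.read(path, DataFrame)     # load the CSV file into a DataFrame
df.age                             # columns are accessed by name
filter(:age => >(30), df)          # rows with age > 30

# Sum of age per group
combine(groupby(df, :group), :age => sum)
```

Column types are detected from the data by `CSV.read`, so `df.age` is an integer vector here.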
17. Conclusion
- ➕ “Simple”, fast, versatile language
- ➕ A lot of packages available
- ➕ Active community
- ➖ Some packages are not mature, breaking changes
- ➖ Compilation can be slow (new session)