Posts about python

GitHub and GitLab for newbies

I wrote a git tutorial for those who don't know git where I tried to explain how to use Git for version control on your local machine.

Of course those of you who know about these things already know that half the fun of git is not using it locally, but using a server that can centralize the develpment and allow collaboration.

Well, good news! I just wrote the chapter where I cover that part!

Read and let me know what you think:

Git Hosting

Playing With Picolisp (Part 1)

I want to learn new languages. But new as in "new to me", not new as in "created last week". So I decided to play with the grandaddy of all cool languages, LISP. Created in 1958, it's even older than I, which is good because it's experienced.

One "problem" with LISP is that there are a million LISPs. You can use Scheme or Common Lisp, or Emacs' Lisp, or a bazillion others. I wanted something simple so it was supposed to be Scheme... but a few days ago I ran into something called Picolisp and it sounded so cool.

Read more…

Playing with Nim

A few days ago I saw a mention in twitter about a language called Nim

And... why not. I am a bit stale in my programming language variety. I used to be fluent in a dozen, now I do 80% python 10% go, some JS and almost nothing else. Because I learn by doing, I decided to do something. Because I did not want a problem I did not know how to solve to get in the way of the language, I decided to reimplement the example for the python book I am writing: a text layout engine that outputs SVG, based on harfbuzz, freetype2 and other things.

This is a good learning project for me, because a lot of my coding is glueing things together, I hardly ever do things from scratch.

So, I decided to start in somewhat random order.

Preparation

I read the Nim Tutorial quickly. I ended referring to it and to Nim by example a lot. While trying out a new language one is bound to forget syntax. It happens.

Wrote a few "hello world" 5 line programs to see that the ecosystem was installed correctly. Impression: builds are fast-ish. THey can get actually fast if you start using tcc instead of gcc.

SVG Output

I looked for libraries that were the equivalent of svgwrite, which I am using on the python side. Sadly, such a thing doesn't seem to exist for nim. So, I wrote my own. It's very rudimentary, and surely the nim code is garbage for experienced nim coders, but I ended using the xmltree module of nim's standard library and everything!

import xmltree
import strtabs
import strformat

type
        Drawing* = tuple[fname: string, document: XmlNode]

proc NewDrawing*(fname: string, height:string="100", width:string="100"): Drawing =
        result = (
            fname: fname,
            document: <>svg()
        )
        var attrs = newStringTable()
        attrs["baseProfile"] = "full"
        attrs["version"] = "1.1"
        attrs["xmlns"] = "http://www.w3.org/2000/svg"
        attrs["xmlns:ev"] = "http://www.w3.org/2001/xml-events"
        attrs["xmlns:xlink"] = "http://www.w3.org/1999/xlink"
        attrs["height"] = height
        attrs["width"] = width
        result.document.attrs = attrs

proc Add*(d: Drawing, node: XmlNode): void =
        d.document.add(node)

proc Rect*(x: string, y: string, width: string, height: string, fill: string="blue"): XmlNode =
        result = <>rect(
            x=x,
            y=y,
            width=width,
            height=height,
            fill=fill
        )

proc Text*(text: string, x: string, y: string, font_size: string, font_family: string="Arial"): XmlNode =
        result = <>text(newText(text))
        var attrs = newStringTable()
        attrs["x"] = x
        attrs["y"] = y
        attrs["font-size"] = font_size
        attrs["font-family"] = font_family
        result.attrs = attrs

proc Save*(d:Drawing): void =
   writeFile(d.fname,xmlHeader & $(d.document))

when isMainModule:
        var d = NewDrawing("foo.svg", width=fmt"{width}cm", height=fmt"{height}cm")
        d.Add(Rect("10cm","10cm","15cm","15cm","white"))
        d.Add(Text("HOLA","12cm","12cm","2cm"))
        d.Save()

While writing this I ran into a few issues abd saw a few nice things:

To build a svg tag, you can use <>svg(attr=value) which is delightful syntax. But what happens if the attr is "xmlns:ev"? That is not a valid identifier, so it doesn't work. So I worked around it by creating a StringTable filling it and setting all attributes at once.

A good thing is the when keyword. usingit as when isMainModule means that code is built and executed when svgwrite.nim is built standalone, and not when used as a module.

Another good thing is the syntax sugar for what in python we would call "object's methods".

Because Add takes a Drawing as first argument, you can just call d.Add() if d is a Drawing. Is simple, it's clear and it's useful and I like it.

One bad thing is that sometimes importing a module will cause weird errors that are hard to guess. For example, this simplified version fails to build:

import xmltree

type
        Drawing* = tuple[fname: string, document: XmlNode]

proc NewDrawing*(fname: string, height:string="100", width:string="100"): Drawing =
        result = (
            fname: fname,
            document: <>svg(width=width, height=height)
        )

when isMainModule:
        var d = NewDrawing("foo.svg")
$ nim c  svg1.nim
Hint: used config file '/etc/nim.cfg' [Conf]
Hint: system [Processing]
Hint: svg1 [Processing]
Hint: xmltree [Processing]
Hint: macros [Processing]
Hint: strtabs [Processing]
Hint: hashes [Processing]
Hint: strutils [Processing]
Hint: parseutils [Processing]
Hint: math [Processing]
Hint: algorithm [Processing]
Hint: os [Processing]
Hint: times [Processing]
Hint: posix [Processing]
Hint: ospaths [Processing]
svg1.nim(9, 19) template/generic instantiation from here
lib/nim/core/macros.nim(556, 26) Error: undeclared identifier: 'newStringTable'

WAT? I am not using newStringTable anywhere! The solution is to add import strtabs which defines it, but there is really no way to guess which imports will trigger this sort of issue. If it's possible that importing a random module will trigger some weird failure with something that is not part of the stdlib and I need to figure it out... it can hurt.

In any case: it worked! My first working, useful nim code!

Doing a script with options / parameters

In my python version I was using docopt and this was smooth: there is a nim version of docopt and using it was as easy as:

  1. nimble install docopt
  2. import docopt in the script

The usage is remarkably similar to python:

import docopt
when isMainModule:
        let doc = """Usage:
        boxes <input> <output> [--page-size=<WxH>] [--separation=<sep>]
        boxes --version"""

        let arguments = docopt(doc, version="Boxes 0.13")
        var (w,h) = (30f, 50f)
        if arguments["--page-size"]:
            let sizes = ($arguments["--page-size"]).split("x")
            w = parse_float(sizes[0])
            h = parse_float(sizes[1])

        var separation = 0.05
        if arguments["--separation"]:
            separation = parse_float($arguments["--separation"])
        var input = $arguments["<input>"]
        var output = $arguments["<output>"]

Not much to say, other that the code for parsing --page-size is slightly less graceful than I would like because I can't figure out how to split the string and convert to float at once.

So, at that point I sort of have the skeleton of the program done. The missing pieces are calling harfbuzz and freetype2 to figure out text sizes and so on.

Interfacing with C libs

One of the main selling points of Nim is that it interfaces with C and C++ in a striaghtforward manner. So, since nobody had wrapped harfbuzz until now, I could try to do it myself!

First I tried to get c2nim working, since it's the recommended way to do it. Sadly, the version of nim that ships in Arch is not able to build c2nim via nimble, and I ended having to manually build nim-git and c2nim-git ... which took quite a while to get right.

And then c2nim just failed.

So then I tried to do it manually. It started well!

  • To link libraries you just use pragmas: {.link: "/usr/lib/libharfbuzz.so".}

  • To declare types which are equivalent to void * just use distinct pointer

  • To declare a function just do some gymanstics:

    proc create*(): Buffer {.header: "harfbuzz/hb.h", importc: "hb_buffer_$1" .}

  • Creates a nim function called create (the * means it's "exported")

  • It is a wrapper around hb_buffer_create (see the syntax there? That is nice!)

  • Says it's declared in C in "harfbuzz/hb.h"

  • It returns a Buffer which is declared thus:

type
    Buffer* = distinct pointer

Here is all I could do trying to wrap what I needed:

{.link: "/usr/lib/libharfbuzz.so".}
{.pragma: ftimport, cdecl, importc, dynlib: "/usr/lib/libfreetype.so.6".}

type
        Buffer* = distinct pointer
        Face* = distinct pointer
        Font* = distinct pointer

        FT_Library*   = distinct pointer
        FT_Face*   = distinct pointer
        FT_Error* = cint

proc create*(): Buffer {.header: "harfbuzz/hb.h", importc: "hb_buffer_$1" .}
proc add_utf8*(buffer: Buffer, text: cstring, textLength:int, item_offset:int, itemLength:int) {.importc: "hb_buffer_$1", nodecl.}
proc guess_segment_properties*( buffer: Buffer): void {.header: "harfbuzz/hb.h", importc: "hb_buffer_$1" .}
proc create_referenced(face: FT_Face): Font {.header: "harfbuzz/hb.h", importc: "hb_ft_font_$1" .}
proc shape(font: Font, buf: Buffer, features: pointer, num_features: int): void {.header: "harfbuzz/hb.h", importc: "hb_$1" .}

proc FT_Init_FreeType*(library: var FT_Library): FT_Error {.ft_import.}
proc FT_Done_FreeType*(library: FT_Library): FT_Error {.ft_import.}
proc FT_New_Face*(library: FT_Library, path: cstring, face_index: clong, face: var FT_Face): FT_Error {.ft_import.}
proc FT_Set_Char_Size(face: FT_Face, width: float, height: float, h_res: int, v_res: int): FT_Error {.ft_import.}

var buf: Buffer = create()
buf.add_utf8("Hello", -1, 0, -1)
buf.guess_segment_properties()

var library: FT_Library
assert(0 == FT_Init_FreeType(library))
var face: FT_Face
assert(0 == FT_New_Face(library,"/usr/share/fonts/ttf-linux-libertine/LinLibertine_R.otf", 0, face))
assert(0 == face.FT_Set_Char_Size(1, 1, 64, 64))
var font = face.create_referenced()
font.shape(buf, nil, 0)

Sadly, this segfaults and I have no idea how to debug it. It's probably close to right? Maybe some nim coder can figure it out and help me?

In any case, conclusion time!

Conclusions

  • I like the language
  • I like the syntax
  • nimble, the package manager is cool
  • Is there an equivalent of virtualenvs? Is it necessary?
  • The C wrapping is, indeed, easy. When it works.
  • The availability of 3rd party code is of course not as large as with other languages
  • The compiling / building is cool
  • There are some strange bugs, which is to be expected
  • Tooling is ok. VSCode has a working extension for it. I miss an opinionated formatter.
  • It produces fast code.
  • It builds fast.

I will keep it in mind if I need to write fast code with limited dependencies on external libraries.

Código Charla PyDay "Como Hacer una API REST en Python, spec first"

El 4/4/2018 di una charla en un PyDay sobre como implementar una API REST a partir de una especificación hecha en Swagger/OpenAPI usando Connexion

/images/pyday-api-rest.jpg

Foto tomada por Yamila Cuestas

Si bien no pude grabar la charla (alguien en la audiencia si lo hizo, pero no me dio el video! Pasame el video, persona de la audiencia!) y no hay slides, acá está el código que mostré, que es relativamente sencillo y fácil de seguir.

Código de la charla

Cualquier cosa pregunten.

PD: Sí, podría hacer la charla en un video nuevo. Sí, me da mucha pereza.

My Git tutorial for people who don't know Git

As part of a book project aimed at almost-beginning programmers I have written what may as well pass as the first part of a Git tutorial. It's totally standalone, so it may be interesting outside the context of the book.

It's aimed at people who, of course, don't know Git and could use it as a local version control system. In the next chapter (being written) I cover things like remotes and push/pull.

So, if you want to read it: Git tutorial for people who don't know git (part I)

PS: If the diagrams are all black and white, reload the page. Yes, it's a JS issue. Yes, I know how to fix it.

I have written half a book

LIke mentioned before I am trying to write a book and ... well, I may be actually making progress? At least the generated PDF is about 170 pages long, which means I have written a bunch in this past month.

I have finished the second of four planned parts, which means I have done about half of it. Since I expect the next two parts to be shorter, it's actually more than that.

The target audience are people who have finished the python tutorial but are not exactly programmers yet. They have the syntax more or less in their heads, but how do you turn that into an actual piece of code?

  • Part 1 is about "prototyping", the process of dumping an idea into rough code.
  • Part 2 is about polishing that rough code into ... not so rough code. Includes a gentle introduction to testing, for example.
  • Part 3 (to be written) is about things that are not code:
    • Git / Gitlab
    • Issues
    • Packaging
    • Setting up a website
    • CI
    • Lots more
  • Part 4 is still to be thought but basically it will cover implementing a large feature from the ground up.

I much appreciate comments about it.

PD: Si, va a haber una traducciń al castellano. O mas bien al argentino. Una vez que lo termine.

I am trying to write a Python book

Once upon a time, I tried to write a book. It did not end well. I was trying to dump a whole lot of knowledge at once. Knowledge I did not really have, to be honest. When I look at that book I see a failed thing.

So, of course, many years later, I am trying again, but with the lessons learned in my mind.

  • It will be a smaller book.
  • I am not also writing a whole tool chain for it.
  • It will be about things I know.

So, what is it?

The temporary title, right now, is something like "Boxes: your second Python book". It says your second Python book because you do need a working knowledge of Python syntax as provided by the official Python Tutorial, but not much else. When there is a particularly hairy piece of code it may link to the tutorial or the reference or something.

The "idea" of the book is to bridge a gap that exists between knowing the basics of reading and writing a language (specially if it's your first!) and being able to effectively using it to create a useful project.

It follows the growth of "Boxes", a simplistic text layout engine, from a vague idea to a fully working, useful, tested, and published piece of software.

It's not there yet, but it's about 25% of the way there.

You can read it here: https://ralsina.gitlab.io/boxes-book/ and the sources are at https://gitlab.com/ralsina/boxes-book

Comments much appreciated!

Lois Lane, Reporting

So, 9 years ago I wrote a post about how I would love a tool that took a JSON data file, a Mako template, and generated a report using reStructured Text.

If you don't like that, pretend it says YAML, Jinja2 and Markdown. Anyway, same idea. Reports are not some crazy difficult thing, unless you have very demanding layout or need to add a ton of logic.

And hey, if you do need to add a ton of logic, you do know python, so how hard can it be to add the missing bits?

Well, not very hard. So here it is, 9 years later because I am sitting at an auditorium and the guy giving the talk is having computer problems.

Lois Lane Reports from PyPI. and GitHub

New mini-project: Gyro

History

Facubatista: ralsina, yo, vos, cerveza, un local-wiki-server-hecho-en-un-solo-.py-con-interfaz-web en tres horas, pensalo

Facubatista: ralsina, you, me, beer, a local-wiki-server-done-in-one-.py-with-web-interface in three hours, think about it

/images/gyro-1.thumbnail.png

The next day.

So, I could not get together with Facu, but I did sort of write it, and it's Gyro. [1]

Technical Details

Gyro has two parts: a very simple backend, implemented using Sanic [2] which does a few things:

  • Serve static files out of _static/
  • Serve templated markdown out of pages/
  • Save markdown to pages/
  • Keep an index of file contents updated in _static/index.js

The other part is a webpage, implemnted using Bootstrap [3] and JQuery [4]. That page can:

  • Show markdown, using Showdown [5]
  • Edit markdown, using SimpleMDE [6]
  • Search in your pages using Lunr [7]

And that's it. Open the site on any URL that doesn't start with _static and contains only letters and numbers:

  • http://localhost:8000/MyPage : GOOD
  • http://localhost:8000/MyDir/MyPage: BAD
  • http://localhost:8000/__foobar__: BAD

At first the page will be sort of empty, but if you edit it and save it, it won't be empty anymore. You can link to other pages (even ones you have not created) using the standard markdown syntax: [go to FooBar](FooBar)

There is really not much else to say about it, if you try it and find bugs, file an issue and as usual patches are welcome.


[1] Why Gyro? Gyros are delicious fast food. Wiki means quick. Also, I like Gyros. to check it out. So, since this was a toy project, why not?
[2] Why Sanic? Ever since Alejandro Lozanoff mentioned a flask-like framework done with the intention to be fast and async I wanted
[3] Why bootstrap? I know more or less what it does, and the resulting page is not totally horrible.
[4] Why JQuery? It's easy, small and I sort of know how it works.
[5] Why Showdown? It's sort of the standard to show markdown on the web.
[6] Why SimpleMDE? It looks and works awesome!
[7] Why Lunr? It works and is smaller than Tipue which is the only other similar thing I know.

How I Learned to Stop Worrying and Love JSON Schema

Intro

This post operates on a few shared assumptions. So, we need to explicitly state them, or otherwise you will read things that are more or less rational but they will appear to be garbage.

  • APIs are good
  • Many APIs are web APIs
  • Many web APIs consume and produce JSON
  • JSON is good
  • JSON is better if you know what will be in it

So, JSON Schema is a way to increase the number of times in your life that JSON is better in that way, therefore making you happier.

So, let's do a quick intro on JSON Schema. You can always read a much longer and surely better one from which I stole most examples at Understanding JSON Schema. later (or right now, it's your time, lady, I am not the boss of you).

Schemas

So, a JSON Schema describes your data. Here is the simplest schema, that matches anything:

{ }

Scary, uh? Here's a more restrictive one:

{
  "type": "string"
}

That means "a thing, which is a string." So this is valid: "foo" and this isn't 42 Usually, on APIs you exchange JSON objects (dictionaries for you pythonistas), so this is more like you will see in real life:

{
  "type": "object",
  "properties": {
    "street_address": { "type": "string" },
    "city": { "type": "string" },
    "state": { "type": "string" }
  },
  "required": ["street_address", "city", "state"]
}

That means "it's an object", that has inside it "street_address", "city" and "state", and they are all required.

Let's suppose that's all we need to know about schemas. Again, before you actually use them in anger you need to go and read Understanding JSON Schema. for now just assume there is a thing called a JSON Schema, and that it can be used to define what your data is supposed to look like, and that it's defined something like we saw here, in JSON. Cool?

Using schemas

Of course schemas are useless if you don't use them. You will use them as part of the "contract" your API promises to fulfill. So, now you need to validate things against it. For that, in python, we can use jsonschema

It's pretty simple! Here is a "full" example.

import jsonschema

schema = {
  "type": "object",
  "properties": {
    "street_address": {"type": "string"},
    "city": {"type": "string"},
    "state": {"type": "string"},
  },
  "required": ["street_address", "city", "state"]
}

jsonschema.validate({
    "street_address": "foo",
    "city": "bar",
    "state": "foobar"
}, schema)

If the data doesn't validate, jsonchema will raise an exception, like this:

>>> jsonschema.validate({
...     "street_address": "foo",
...     "city": "bar",
... }, schema)
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "jsonschema/validators.py", line 541, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "jsonschema/validators.py", line 130, in validate
    raise error
jsonschema.exceptions.ValidationError: 'state' is a required property

Failed validating 'required' in schema:
    {'properties': {'city': {'type': 'string'},
                    'state': {'type': 'string'},
                    'street_address': {'type': 'string'}},
     'required': ['street_address', 'city', 'state'],
     'type': 'object'}

On instance:
    {'city': 'bar', 'street_address': 'foo'}

Hey, that is a pretty nice description of what is wrong with that data. That is how you use a JSON schema. Now, where would you use it?

Getting value out of schemas

Schemas are useless if not used. They are worthless if you don't get value out of using them.

These are some ways they add value to your code:

  • You can use them in your web app endpoint, to validate things.
  • You can use them in your client code, to validate you are not sending garbage.
  • You can use a fuzzer to feed data that is technically valid to your endpoint, and make sure things don't explode in interesting ways.

But here is the most value you can extract of JSON schemas:

You can discuss the contract between components in unambiguous terms and enforce the contract once it's in place.

We are devs. We discuss via branches, and comments in code review. JSON Schema turns a vague argument about documentation into a fact-based discussion of data. And we are much, much better at doing the latter than we are at doing the former. Discuss the contracts.

Since the document describing (this part of) the contract is actually used as part of the API definitions in the code, that means the document can never be left behind. Every change in the code that changes the contract is obvious and requires an explicit renegotiation. You can't break API by accident, and you can't break API and hope nobody will notice. Enforce the contracts.

Finally, you can version the contract. Use that along with API versioning and voilá, you know how to manage change! Version your contracts.

  • Discuss your contracts
  • Enforce your contracts
  • Version your contracts

So now you can stop worrying and love JSON Schema as well.