Episodes
August 23, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Chris #1: Why your mock doesn’t work, by Ned Batchelder. TDD is an important practice for development, and as my team is finding out, mocking objects is not as easy as it seems at first. I love that Ned gives an overview of how Mock works, but he also gives two resources that show alternatives to Mock for when you really don’t need it. From reading these articles and the video, I’ve learned that it’s hard to make mocks, but it’s important to: create only one mock for each object you’re mocking; have it mock only what you need; have tests that run the mock against your code and your mock against the third party.
Mahmoud #2: Vermin, by Morten Kristensen. Rules-based Python version compatibility detector. caniuse is cool, but it’s based on classifiers. When it comes to your own code, it’ll only tell you what you tell it. If you’ve got legacy libraries or, like most companies, an application, then you’ll need something more powerful. Vermin tells you the minimum compatible Python version, all the way down to the module and even function level.
Brian #3: The nonlocal statement in Python, by Abhilash Raj. When global is too big of a hammer. This doesn’t work:
def function():
    x = 100
    def incr(y):
        x = x + y
    incr(100)
This does:
def function():
    x = 100
    def incr(y):
        nonlocal x
        x = x + y
    incr(100)
    print(x)
Chris #4: twitter.com/brettsky/status/1163860672762933249 (Brett Cannon). Microsoft Azure improves Python support. 2 key points about the new Python support in Azure Functions: it's debuting w/ 3.6, but 3.7 support is actively being worked on and 3.8 support won't take nearly as long, and native async/await support!
Mahmoud #5: Awesome Python Applications update. Presented at PyBay 2019. Slides/summary (video forthcoming): http://sedimental.org/talks.html#ask-the-ecosystem-lessons-from-250-foss-python-applications 250+ applications, dating back to 1998 (mailman, gedit). 95% of applications have commits in 2019. 65% of applications support Python 3 (even the ones with a long history!). Other interesting findings. Presenting these findings and more at PyGotham 2019, NYC in early October.
Brian #6: pre-commit now has a quick start guide. Wanna use pre-commit but don’t know how to start? Here ya go! Runs through: install; configuration; installing hooks; running hooks against your project. I’d like to add: add hooks to your project one at a time. For each new hook: add it to .pre-commit-config.yaml; run pre-commit install to install the hook; run pre-commit run --all-files; review changes made to your project; if good, commit; if bad, revert, modify the config of the tools (such as pyproject.toml for black, .flake8 for flake8, etc.), and try again.
Extras
Chris: Humble Bundle by No Starch supports the Python Software Foundation. https://codechalleng.es/ released Newbie Bites, challenges that are intended for people brand new to Python (direct link: https://gumroad.com/l/Xhxeo).
Mahmoud: PyGotham 2019 in October (Maintainers Conf in Washington DC, too). Real Python Pandas course.
Brian: http://py3readiness.org/ shows 360 of the top downloaded Python packages are all Python 3 ready.
Jokes
I was looking for some programming one liners online; looked on a reddit thread; read a great answer; which was “any joke can be a one-liner with enough semicolons.”
A SQL statement walks into a bar and up to two tables and asks, “Mind if I join you?”
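A runnable take on the nonlocal item from this episode (Brian #3) may help. It is only a sketch: broken and fixed mirror the two snippets in the notes, while make_counter and bump are made-up names added to show a typical closure use.

```python
def broken():
    x = 100

    def incr(y):
        # Assigning to x makes it local to incr, so reading it first fails.
        x = x + y
        return x

    return incr(100)


def fixed():
    x = 100

    def incr(y):
        nonlocal x  # rebind the x that lives in fixed(), not a new local
        x = x + y

    incr(100)
    return x


def make_counter():
    """Hypothetical helper: a closure that keeps its own running count."""
    count = 0

    def bump():
        nonlocal count
        count += 1
        return count

    return bump


if __name__ == "__main__":
    try:
        broken()
    except UnboundLocalError as exc:
        print("broken():", exc)
    print("fixed():", fixed())
```

global would not help here, because x is not a module-level name; nonlocal targets the nearest enclosing function scope instead.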
August 14, 2019
Special guest: Kelly Schuster-Paredes
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Brian #1: Keynote: Python 2020 - Łukasz Langa - PyLondinium19. Enabling Python on new platforms is important. Python needs to expand further than just CPython. Web, 3D games, system orchestration, mobile: all have other languages that are more used. Perhaps it’s because the full Python language, as in full CPython, is more than is needed, and a limited language is necessary. MicroPython and CircuitPython are successful. They are limited implementations of Python. Łukasz talks about many parts of Python that could probably be trimmed to make targeted platforms very usable without losing too much. It’d be great if more projects tried to implement Python versions for other platforms, even if the Python implementation is limited.
Kelly #2: Mu Editor by Nicholas Tollervey. Lots of updates happening to the Code with Mu software. Mu is a Python code editor for beginner programmers, originally created as a contribution from the Python Software Foundation for the BBC’s micro:bit project. Code with Mu presented at EuroPython and shared a lot of interesting updates and things in the alpha version of Mu, available on the Code with Mu website. Mu is a modal editor, with modes for: BBC micro:bit; CircuitPython; ESP MicroPython; Pygame Zero; Python 3. Tiago Monte’s recorded presentation at EuroPython. Game with Turtle. Flask (release notes). Made with Mu at EuroPython videos. Hot off the press: Nick just released PyperCard, a HyperCard inspired GUI framework for BEGINNER developers in Python, based off of Adafruit’s release. “PyperCard is a HyperCard inspired Pythonic and deliberately constrained GUI framework for beginner programmers.” Linked repos on GitHub. The module re-uses the JSON specification used to create HyperCard. The concept allows users to “create Hypercard like stacks of states” so that beginner coders can create choose-your-own-adventure games.
Michael #3: Understanding the Python Traceback, by Chad Hansen. The Python traceback has a wealth of information that can help you diagnose and fix the reason for the exception being raised in your code. What do we learn right away? The type of error; a description of the error (hopefully, sometimes); the line of code the error occurred on; the call stack (filenames, line numbers, and module names); whether the error happened while handling another error. Read from bottom to top, which was weird to me. Most common error? AttributeError: 'NoneType' object has no attribute 'an_attribute'. The article talks about other common errors. Are you creating custom exceptions to make your packages more useful?
Brian #4: My oh my, flake8-mypy and pytest-mypy, contributed by Ray Cote via email. “For some reason, I continually have problems running mypy, getting it to look at the correct paths, etc. However, when I run it from flake8-mypy, I'm getting reasonable, actionable output that is helping me slowly type hint my code (and shake out a few bugs in the process). There's also a pytest-mypy, which I've not yet tried.” - Ray. flake8-mypy: maintained by Łukasz Langa. “The idea is to enable limited type checking as a linter inside editors and other tools that already support Flake8 warning syntax and config.” pytest-mypy: maintained by Dan Bader and David Tucker. “Runs the mypy static type checker on your source files as part of your pytest test runs.” Remind me to do a PR against the README to make pytest lowercase.
Kelly #5: Lego Education and Spike. In March of this year, Lego Education announced a new robot, the first since the EV3 release of Mindstorms in 2013. Currently the EV3 Mindstorms can be coded with Python and it is assumed that Spike Prime can be as well. The EV3 robots can be coded in Python thanks to Nigel Ward, who created a site back in 2016 or earlier, through a project called ev3dev.
ev3dev is a Debian Linux-based operating system. Until recently, Lego had not endorsed the use of Python, nor had they released documentation. Lego released a Getting started with EV3 MicroPython guide (59 pages, Version 1.0.0). EV3 MicroPython runs on top of ev3dev with a new Pybricks MicroPython runtime and library. It has its own Visual Studio Code extension, so no need for a terminal. It has instructions and lists of the different features and classes used to program the PyBricks API - A python wrapper for the Databricks Rest API. Pybricks is on GitHub from one contributor, Sebastien Thomas, under MIT license. David Lechner, Laurens Valk, and Anton Vanhoucke are contributors to the Lego MicroPython release. This opens up opportunities for students who compete in the First Lego League Competition to code in Python. Example code for the Gyrobot.
Michael #6: Python 3 at Mozilla. From January 2019. Mozilla uses a lot of Python. In mozilla-central there are over 3500 Python files (excluding third party files), comprising roughly 230k lines of code. Additionally there are 462 repositories labelled with Python in the Mozilla org on GitHub. That’s a lot of Python, and most of it is Python 2. But before tackling those questions, I want to address another one that often comes up right off the bat: Do we need to be 100% migrated by Python 2’s EOL? No. But punting the migration into the indefinite future would be a big mistake: Python 2 will no longer receive security fixes; all of the third party packages we rely on (and there are a lot of them) will also stop being supported; delaying means more code to migrate; opportunity cost: Python 3 was first released in 2008 and in that time there have been a huge number of features and improvements that are not available in Python 2. The best time to get serious about migrating to Python 3 was five years ago. The second best time is now. Moving to Python 3: We stood up some linters.
One linter that makes sure Python files can at least get imported in Python 3 without failing; one that makes sure Python 2 files use appropriate __future__ statements to make migrating that file slightly easier in the future. Pipenv & poetry & Jetty: a little experiment I’ve been building. It is a very thin wrapper around Poetry.
Extras
Brian: Python 3.8.0b3. “We strongly encourage maintainers of third-party Python projects to test with 3.8 during the beta phase and report issues …”
Michael: pipx now has shell completions.
Kelly: Teaching Python podcast.
Jokes
via Real Python and Nick Spirit: Python private method → joke cartoon image.
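To make the traceback item from this episode (Michael #3) concrete, here is a small sketch of a custom exception raised while handling another error. ConfigError, load_port and describe_failure are hypothetical names invented for this example; only the traceback module and the raise ... from syntax are standard Python.

```python
import traceback


class ConfigError(Exception):
    """Hypothetical custom exception for a package's configuration problems."""


def load_port(settings):
    try:
        return int(settings["port"])
    except KeyError as exc:
        # "raise ... from" stores the original error as __cause__, so the
        # traceback prints both errors, with the most recent one last.
        raise ConfigError("missing required setting 'port'") from exc


def describe_failure(settings):
    """Return the formatted traceback text for a failing load_port call."""
    try:
        load_port(settings)
    except ConfigError:
        return traceback.format_exc()
    return ""


if __name__ == "__main__":
    print(describe_failure({}))
```

Reading the printed text from the bottom up gives the custom error first, then the note that it was caused by the KeyError, then the frames that led there.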
August 6, 2019
Special guest: Brett Thomas
Sponsored by Datadog: pythonbytes.fm/datadog
Brian #1: Writing sustainable Python scripts, by Vincent Bernat. Turning a quick Python script into a maintainable bit of software. Topics covered: Documentation as a docstring helps future users/maintainers know what problem you are solving. CLI arguments with defaults instead of hardcoded values help extend the usability of the script. Logging, including debug logging (and how to turn it on with CLI arguments), and system logging for unattended scripts. Tests: simple doctests, and pytest tests utilizing parametrize to have one test and many test cases.
Brett #2: Static Analysis and Bandit
Michael #3: jupyter-black. Black formatter for Jupyter Notebook. One of the big gripes I have about these online editors is their formatting (often entirely absent). The extension provides: a toolbar button; a keyboard shortcut for reformatting the current code-cell (default: Ctrl-B); a keyboard shortcut for reformatting whole code-cells (default: Ctrl-Shift-B).
Brian #4: Report Generation workflow with papermill, jupyter, rclone, nbconvert, … Chris Moffitt articles: Automated Report Generation with Papermill: Part 1; Automated Report Generation with Papermill: Part 2. Jupyter Notebooks used to create a report with pandas and matplotlib. nbconvert to create an HTML report. Papermill to parametrize the process with different data, and execute the notebook. Copy the reports to shared cloud folders using Rclone. Set up a process to automate everything. Hook it up to cron to run regularly.
Brett #5: Rant on time deltas. datetime.timedelta(months=1) # Boom, too bad. Use: https://dateutil.readthedocs.io/en/stable/
Michael #6: How - and why - you should use Python Generators, by Radu Raicea. Generator functions allow you to declare a function that behaves like an iterator. They allow programmers to make an iterator in a fast, easy, and clean way. They only compute a value when you ask for it. This is known as lazy evaluation.
If you’re not using generators, you’re missing a powerful feature. Often they result in simpler code than with lists and standard functions.
Extras
Brian: PyPI now supports uploading via API token, also on Test PyPI.
Michael: Chocolatey package manager on Windows, via Prayson Daniel. GvR’s next PEG article.
Jokes
A good programmer is someone who always looks both ways before crossing a one-way street. (Reminds me of another joke: Adulthood is like looking both ways before crossing the street, then getting hit by an airplane.)
Little Bobby Tables
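The lazy evaluation point in this episode's generators item (Michael #6) is easier to see in running code. The sketch below is illustrative only; squares_gen, first_n and traced_numbers are names chosen for the example, not taken from the article.

```python
import itertools


def squares_gen():
    """Lazy and unbounded: each square is computed only when requested."""
    i = 0
    while True:
        yield i * i
        i += 1


def first_n(iterable, n):
    """Pull just the first n items out of any iterable."""
    return list(itertools.islice(iterable, n))


def traced_numbers(log):
    """Record in log the moment each value is actually produced."""
    for i in range(3):
        log.append(f"computing {i}")
        yield i


if __name__ == "__main__":
    print(first_n(squares_gen(), 5))

    log = []
    numbers = traced_numbers(log)
    print(log)            # nothing has run yet
    print(next(numbers))  # the first value is computed on demand
    print(log)
```

An equivalent list-based function could not represent the unbounded sequence at all, which is one reason the generator version often ends up simpler.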
July 29, 2019
Sponsored by Datadog: pythonbytes.fm/datadog
Brian #1: Debugging with f-strings in Python 3.8. We’ve talked about the walrus operator, :=, but not yet “debug support for f-strings”. This: print(f'foo={foo} bar={bar}') can change to this: print(f'{foo=} {bar=}') and if you don’t want to print with repr() you can have str() be used with !s: print(f'{foo=!s} {bar=!s}'). Float format modifiers can also be used; pre-release builds spelled this with !f, but in the released 3.8 the format spec follows the = directly:
>>> import math
>>> print(f'{math.pi=:.2f}')
math.pi=3.14
One more feature, space preservation in the f-string expressions:
>>> a = 37
>>> print(f'{a = }, {a = }')
a = 37, a = 37
Michael #2: Am I a "real" software developer yet? by Sun-Li Beatteay. To new programmers joining the field, especially those without CS degrees, it can feel like the title is safe-guarded, only bestowed on the select few who have proven themselves. Sometimes manifests itself as Impostor Syndrome. Focused on front-end development, as I had heard that HTML, CSS and JavaScript were easy to pick up. That was when I decided to create a portfolio site for my wife, who was a product designer. Did my best to surround myself with tech culture: watched YouTube videos, listened to podcasts, read blog posts from experienced engineers to keep myself motivated. Daydreamed what it would be like to stand in their shoes. My wife’s website went live in July of that year. I had done it. Could I finally start calling myself something of a Software Engineer? “Web development isn’t real programming.” Spent the next 18 months studying software development full time. I quit my job and moved in with my in-laws, which was a journey in and of itself. Software engineer after 1-2 years?
Not so fast (says the internet). The solution that I found for myself was simple yet terrifying: talking to people. MK: BTW, I don’t really like the term “engineer”.
Brian #3: Debugging with local variables and snoop. Debugging tools. Example: “You want to know which lines are running and which aren't, and what the values of the local variables are.” Throw a @snoop decorator on a function and the function lines and local variable values will be dumped to stderr during the run, even showing loops a bunch of times. It’s a tool to almost debug as if you had a debugger, without a debugger, and without having to add a bunch of logging or print statements. Lots of other use models to allow more focus: wrap just part of your function with a with snoop block; only watch certain local variables; turn off reporting for deep function/block levels.
Michael #4: New home for Humans. This came out of the blue with some trepidation. kennethreitz commented 6 days ago: In the spirit of transparency, I'd like to (publicly) find a new home for my repositories. I want to be able to still make contributions to them, but no longer be considered the "owner" or "arbiter" or "BDFL" of these repositories. Some notable repos: https://github.com/kennethreitz/requests https://github.com/kennethreitz/records https://github.com/kennethreitz/requests-html https://github.com/kennethreitz/setup.py https://github.com/kennethreitz/legit https://github.com/kennethreitz/responder Lots of back and forth until Ernest jumped in: The Python Software Foundation would like to offer to accept transfers of these repositories into the @psf GitHub organization. This organization was recently acquired by the Python Software Foundation and is intended to provide administrative backstopping for projects in the ecosystem; existing maintainers of various projects will remain and the PSF staff will be available to manage repositories and teams as necessary.
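Going back to the f-string debugging item that opened this episode (Brian #1), the forms it lists can be tried side by side. This sketch assumes Python 3.8 or newer; foo and bar are placeholder values.

```python
import math

foo = "spam"
bar = 42

old_style = f"foo={foo} bar={bar}"   # name typed twice, value shown via str()
debug_repr = f"{foo=} {bar=}"        # = echoes the expression, then repr()
debug_str = f"{foo=!s} {bar=!s}"     # !s switches the value to str()
debug_float = f"{math.pi=:.2f}"      # a format spec applies to the value
debug_spaces = f"{bar = }"           # whitespace around = is preserved

if __name__ == "__main__":
    for line in (old_style, debug_repr, debug_str, debug_float, debug_spaces):
        print(line)
```

The repr() default is what makes the quotes appear around string values, which is usually what you want when hunting for stray whitespace or type mix-ups.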
Brian #5: The Backwards Commercial License, by Eran Hammer (open source dev, including hapi.js). Interesting idea to make open source projects maintainable. Three phases of software lifecycle for some projects: first: project created to fill a need in one project/team/company, a single use case; second: used by many, active community, growing audience; third: work feels finished. Bug fixes, security issues, minor features continue, but most people can stay on old stable versions. During the “done” phase, companies would like to have bug fixes but don’t want to have to keep changing their code to keep up. Idea: commercial license to support old stable versions. “If you keep up with the latest version, you do not require a license (unless you want the additional benefits it will provide).” “However, very few companies can quickly migrate every time there is a new major release of a core component. Engineering resources are limited and in most cases, are better directed at building great products than upgrading supporting infrastructure. The backwards license provides this exact assurance. You can stay on any version you would like knowing that you are still running supported, well-maintained, and secure code.” “The new commercial license will include additional benefits focused on providing enterprise customers the assurances needed to rely on these critical components for many years to come.”
Michael #6: Switching Python Parsers? via Gi Bi, article by Guido van Rossum. Alternative to the home-grown parser generator that I developed 30 years ago when I started working on Python. (That parser generator, dubbed “pgen”, was just about the first piece of code I wrote for Python.) Here are some of the issues with pgen that annoy me. The “1” in the LL(1) moniker implies that it uses only a single token lookahead, and this limits our ability to write nice grammar rules.
Because of the single-token lookahead, the parser cannot determine whether it is looking at the start of an expression or an assignment. So how does a PEG parser solve these annoyances? By using an infinite lookahead buffer! The typical implementation of a PEG parser uses something called “packrat parsing”, which not only loads the entire program in memory before parsing it, but also allows the parser to backtrack arbitrarily. Why not sooner? Memory! But that is much less of an issue now. My idea now, putting these things together, is to see if we can create a new parser for CPython that uses PEG and packrat parsing to construct the AST directly during parsing, thereby skipping the intermediate parse tree construction, possibly saving memory despite using an infinite lookahead buffer.
Extras
Brian: Plone 5.2: https://plone.org/news/2019/plone-5-2-the-future-proofing-release Plone is a content management system built on top of Zope, a web application server framework. Plone 5.2 supports Python 3.6, 3.7, 3.8 and uses Zope 4, which also supports Python 3. Multi-year effort. Interview with Philip Bauer, organizer of 5.2.
Michael: Building Dab and T-Pose Controlled Lights - Make Art with Python
Jokes
A couple of quick ones: “What is a whale’s favorite language?” “C” (via Eric Nelson). Why do Pythons live on land? Because they are above C-level! (via Jesper Kjær Sørensen @JKSlonester)
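The expression-versus-assignment problem from the parser item in this episode (Michael #6) can be sketched with a toy packrat parser. This is an illustration of the general technique only, not Guido's code or anything from CPython: the three-rule grammar, the PackratParser class and every method name are assumptions made up for the example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[=+])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"bad character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class PackratParser:
    """Each rule takes a token index and returns (tree, next_index) or None."""

    def __init__(self, tokens):
        self.tokens = tokens  # the whole input is held in memory
        self.memo = {}        # (rule name, position) -> cached result

    def apply(self, rule, pos):
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.memo[key] = rule(pos)
        return self.memo[key]

    def expect(self, pos, literal):
        return pos < len(self.tokens) and self.tokens[pos] == literal

    # statement <- assignment / expr   (ordered choice, backtracks on failure)
    def statement(self, pos):
        return self.apply(self.assignment, pos) or self.apply(self.expr, pos)

    # assignment <- NAME '=' expr
    def assignment(self, pos):
        name = self.apply(self.name, pos)
        if name and self.expect(name[1], "="):
            value = self.apply(self.expr, name[1] + 1)
            if value:
                return ("assign", name[0], value[0]), value[1]
        return None

    # expr <- atom '+' expr / atom
    def expr(self, pos):
        left = self.apply(self.atom, pos)
        if left and self.expect(left[1], "+"):
            right = self.apply(self.expr, left[1] + 1)
            if right:
                return ("add", left[0], right[0]), right[1]
        return left

    # atom <- NAME / NUMBER
    def atom(self, pos):
        return self.apply(self.name, pos) or self.apply(self.number, pos)

    def name(self, pos):
        if pos < len(self.tokens) and self.tokens[pos].isidentifier():
            return ("name", self.tokens[pos]), pos + 1
        return None

    def number(self, pos):
        if pos < len(self.tokens) and self.tokens[pos].isdigit():
            return ("num", int(self.tokens[pos])), pos + 1
        return None


def parse(text):
    parser = PackratParser(tokenize(text))
    result = parser.statement(0)
    if result is None or result[1] != len(parser.tokens):
        raise SyntaxError("could not parse: " + text)
    return result[0]
```

For input such as "x + 1", statement first tries assignment, fails at the missing =, and falls back to expr from the same position; the memo table means the NAME at position 0 is not parsed a second time. That cache is the memory cost the notes mention.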
July 23, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Brian #1: Becoming a 10x Developer: 10 ways to be a better teammate, by Kate Heddleston. “A 10x engineer isn’t someone who is 10x better than those around them, but someone who makes those around them 10x better.” Create an environment of psychological safety. Encourage everyone to participate equally. Assign credit accurately and generously. Amplify unheard voices in meetings. Give constructive, actionable feedback and avoid personal criticism. Hold yourself and others accountable. Cultivate excellence in an area that is valuable to the team. Educate yourself about diversity, inclusivity, and equality in the workplace. Maintain a growth mindset. Advocate for company policies that increase workplace equality. The article includes lots of actionable advice on how to put these into practice. Examples: Ask people their opinions in meetings. Notice when someone else might be dominating a conversation and make room for others to speak.
Michael #2: quasar & vue.py, via Doug Farrell. Quasar is a Vue.js based framework, which allows you as a web developer to quickly create responsive++ websites/apps in many flavours: SPAs (Single Page App); SSR (Server-side Rendered App) (+ optional PWA client takeover); PWAs (Progressive Web App); Mobile Apps (Android, iOS, …) through Apache Cordova; Multi-platform Desktop Apps (using Electron). Great for Python backends. Tons of Vue components. But could it be all Python? vue.py provides Python bindings for Vue.js. It uses brython to run Python in the browser. Examples can be found here.
Brian #3: Regular Expressions 101. We talked about regular expressions in episode 138. Some tools were shared with me after I shared a regex joke on twitter, including this one.
Build expressions for Python and also PHP, JavaScript, and Go. Put in an example, and build the regex to match. Explanations included. Match information, including match groups and multiple matches. Quick reference of all the special characters and what they mean. Generates code for you to see how to use it in Python. Also fun (and shared from twitter): Regex Golf. See how far you can get matching the strings on the left but not the list on the right. I got 3 in and got stuck. Seems I need to practice some more.
Michael #4: python-diskcache. Caching can be HUGE for perf benefits. But memory can be an issue. Persistence across executions (e.g. web app redeploy) is an issue. Servers can be issues themselves. Enter the disk! Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python. DigitalOcean and many hosts now offer SSDs by default. Unfortunately the file-based cache in Django is essentially broken. DiskCache efficiently makes gigabytes of storage space available for caching. By leveraging rock-solid database libraries and memory-mapped files, cache performance can match and exceed industry-standard solutions. There's no need for a C compiler or running another process. Performance is a feature. Testing has 100% coverage with unit tests and hours of stress. Nice comparison chart.
Brian #5: The Python Help System. Overview of the built in Python help system, help(). Examples to try in a repl: help(print); help(dict); help('assert'); import math; help(math.log). Also returns docstrings from your non-built-in stuff, like your own methods.
Michael #6: Python Architecture Graphs, by David Seddon. Impulse - a CLI which allows you to quickly see a picture of the import graph of any installed Python package at any level within the package. Useful to run on an unfamiliar part of a code base, to help get a quick idea of the structure. It's a visual explorer to give you a quick signal on architecture.
Import Linter - this allows you to declare and check contracts about your dependency graph, which gives you the ability to lint your code base against architectural rules. Helpful to enforce certain architectural constraints and prevent circular dependencies creeping in.
Extras
Michael: tabnanny. Flask course is out, give it a look.
Jokes
Two threads walk into a bar. The barkeeper looks up and yells, 'Hey, I want don't any conditions race like time last!’
A string value walked into a bar, and then was sent to stdout.
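For the Regular Expressions 101 item in this episode (Brian #3), the match information it mentions (groups and multiple matches) maps onto Python's re module roughly as below. This is hand-written for illustration rather than output generated by the site; the date pattern and sample text are made up.

```python
import re

# Hypothetical pattern: ISO-style dates such as 2019-07-23, with named groups.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

text = "Episodes aired on 2019-07-23 and 2019-07-18."

# First match only, with access to the whole match and to each group.
first = DATE.search(text)

# Every match, each as a dict keyed by group name.
all_dates = [match.groupdict() for match in DATE.finditer(text)]

if __name__ == "__main__":
    print(first.group(0), first.group("year"))
    for entry in all_dates:
        print(entry)
```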
July 18, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Special guest: Ines Montani
Brian #1: Simplify Your Python Developer Environment. Contributed by Nils de Bruin. “Three tools (pyenv, pipx, pipenv) make for smooth, isolated, reproducible Python developer and production environments.” The tools: pyenv - install and manage multiple Python versions and flavors; pipx - install a Python application with its own virtual environment for use globally; pipenv - managing virtual environments and dependencies on a per project basis. Brian note: I’m not sold on any of these yet, but honestly haven’t given them a fair shake either, and also didn’t really know how to try them all out. This is a really good write up to get started.
Ines #2: New fast.ai course: A Code-First Introduction to Natural Language Processing. fast.ai is a really popular, free course for deep learning by Rachel Thomas and Jeremy Howard. Also comes with a Python library and lots of notebooks. Some influential research was developed alongside the course, e.g. ULMFiT (popular algorithm for NLP tasks like text classification). New course on Natural Language Processing: practical introduction to NLP covering both modern neural network approaches and traditional techniques. Highlights: NLP background: topic modeling and linear models; rule-based approaches and real-world problem solving; focus on ethics, with videos on bias and disinformation.
Michael #3: Cloning the human voice In 5 minutes, with Python, via Brenden. Clone a voice in 5 seconds to generate arbitrary speech in real-time. An implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time.
Watch the video: https://www.youtube.com/watch?v=-O_hYhToKoA Also: Fake voices 'help cyber-crooks steal cash’
Brian #4: Ab(using) pyproject.toml and stuffing pytest.ini and mypy.ini content into it. Contributed by Andrew Spittlemeister. My first reaction is horror, but this is kinda my thought process with this one: toml is not ini (but they look close); neither pytest nor mypy support storing configuration in pyproject.toml; they both do support using setup.cfg (but flit and poetry projects don’t use that file, or try not to); they both support passing in the config file as a command line argument; you can be careful and write a pyproject.toml file that is both toml and ini compliant; drat, this is a reasonable idea, if a little wacky; no guarantee that it will keep working. One thing to note: use quotes for stuff you normally wouldn’t need to in an ini file.
Example ini:
[pytest]
addopts = -ra -v
If stuffed in pyproject.toml:
[pytest]
addopts = "-ra -v"
To run:
> mypy --config-file pyproject.toml module_name
> pytest -c pyproject.toml
Ines #5: Polyaxon. A platform for reproducing and managing the whole life cycle of machine learning and deep learning applications. We talked to lots of research groups and everyone works with just their GPU on desktop. Super slow: you need to wait for results, schedule the next job, etc. Polyaxon is a free open source library built on Kubernetes. Really easy to set up, especially on Google Kubernetes Engine. Especially good for hyper-parameter search, where you might not need GPU experiments if you can run lots of experiments in parallel. Release v0.5 just came out today.
Big improvements: plugins system; local runs, for much easier debugging; new workflow engine for chaining things together and running experiments with lots of steps.
Michael #6: Flynt for f-strings. A tool to automatically convert old string literal formatting to f-strings. F-Strings: Not only are they more readable, more concise, and less prone to error than other ways of formatting, but they are also faster! Converted over 500 lines / expressions in Talk Python Training and Python Bytes. Get started with a pipx install: pipx install flynt. Then point it at a file: flynt somefile.py or a directory (recursively): flynt ./ Converts code like this: print("Greetings {}, you have found {:,} items!".format(name, count)) to code like this: print(f"Greetings {name}, you have found {count:,} items!") Beware of the digit grouping bug. Good project for jumping in and contributing to open source.
Extras
Thanks to André Jaenisch for pointing out the existence of ReDoS attacks and a good video explaining them.
Michael: Python httptoolkit. Python Magic’s name, via David Martínez. Flying Fractals (video and code). Python 3.7.4 is out.
Ines: Explosion (?) spaCy IRL 2019: our very first conference, held on July 6 in Berlin; many amazing speakers from research, applied NLP and the community; all talks were recorded and will be up on our YouTube channel very soon. FastAPI core developer Sebastián Ramírez is joining our team. FastAPI was presented by Brian in episode 123 of this podcast. We’re big fans and have been switching all our APIs over to FastAPI. We’ll keep supporting the project and will definitely give Sebastián enough time to keep working on it.
Joke
A programmer walks into a bar and orders 1.38 root beers. The bartender informs her it's a root beer float. She says 'Make it a double!’
What do you call a developer without a side project? Well rested.
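The before and after strings in this episode's Flynt item (Michael #6) can be checked against each other directly. The sketch below only compares the two spellings by hand; it does not run flynt, and the sample values and function names are invented.

```python
name = "Ines"
count = 1234567


def greet_format(name, count):
    # Old style: positional fields filled in by str.format().
    return "Greetings {}, you have found {:,} items!".format(name, count)


def greet_fstring(name, count):
    # f-string style: same text, with the :, grouping spec kept on count.
    return f"Greetings {name}, you have found {count:,} items!"


if __name__ == "__main__":
    print(greet_format(name, count))
    print(greet_fstring(name, count))
```

Keeping the format spec (here :,) attached to the right variable is the part worth double-checking after any automated conversion.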
July 8, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean
Brian #1: flake8-comprehensions, submitted by Florian Dahlitz. I’m already using flake8, so adding this plugin is a nice idea. It checks your code for some questionable generator and comprehension code.
C400 Unnecessary generator - rewrite as a list comprehension.
C401 Unnecessary generator - rewrite as a set comprehension.
C402 Unnecessary generator - rewrite as a dict comprehension.
C403 Unnecessary list comprehension - rewrite as a set comprehension.
C404 Unnecessary list comprehension - rewrite as a dict comprehension.
C405 Unnecessary (list/tuple) literal - rewrite as a set literal.
C406 Unnecessary (list/tuple) literal - rewrite as a dict literal.
C407 Unnecessary list comprehension - '<builtin>' can take a generator.
C408 Unnecessary (dict/list/tuple) call - rewrite as a literal.
C409 Unnecessary (list/tuple) passed to tuple() - (remove the outer call to tuple()/rewrite as a tuple literal).
C410 Unnecessary (list/tuple) passed to list() - (remove the outer call to list()/rewrite as a list literal).
C411 Unnecessary list call - remove the outer call to list().
Example: Rewrite list(f(x) for x in foo) as [f(x) for x in foo]. Rewrite set(f(x) for x in foo) as {f(x) for x in foo}. Rewrite dict((x, f(x)) for x in foo) as {x: f(x) for x in foo}.
Michael #2: PyOxidizer (again). Michael’s assessment: there are three large and looming threats to Python. Lack of: a real mobile development story; GUI applications on desktop operating systems; sharing your application with users (this is VERY far from deployment to servers). Covered PyOxidizer before, but it seems to have just rocketed off in the last couple of weeks. At their PyCon 2019 keynote talk, Russell Keith-Magee identified code distribution as a potential black swan - an existential threat for longevity - for Python.
“Python hasn't ever had a consistent story for how I give my code to someone else, especially if that someone else isn't a developer and just wants to use my application.” They announced the first release of PyOxidizer (project, documentation), an open source utility that aims to solve the Python application distribution problem! PyOxidizer's marquee feature is that it can produce a single file executable containing a fully-featured Python interpreter, its extensions, standard library, and your application's modules and resources. You can have a single .exe providing your application. Unlike other tools in this space which tend to be operating system specific, PyOxidizer works across platforms (currently Windows, macOS, and Linux - the most popular platforms for Python today). PyOxidizer loads everything from memory and there is no explicit I/O being performed. When you import a Python module, the bytecode for that module is being loaded from a memory address in the executable using zero-copy. This makes PyOxidizer executables faster to start and import - faster than a python executable itself!
Brian #3: Using changedir to avoid the need for src. I’ve been experimenting with combining flit, pytest, tox, and coverage for new projects. And in doing so, ran across a cool feature of tox that I didn’t know about before, changedir. It’s a feature of tox to allow you to run tests in a different directory than the top level project directory. Links: tox changedir docs; tox and pytest and changedir. I talk about this more in episode 80 of Test & Code. As an example project I built yet another markdown converter using regular expressions. This is funny to me, considering the recent cloudflare outage due to a single regular expression. https://blog.cloudflare.com/cloudflare-outage/ “Tragedy is what happens to me, comedy is what happens to you” (Mel Brooks, approximate quote).
Michael #4: WebRTC and ORTC implementation for Python using asyncio Web Real-Time Communication (WebRTC) - WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. Object Real-Time Communication (ORTC) - ORTC (Object Real-Time Communications) is an API allowing developers to build next generation real-time communication applications for web, mobile, or server environments. The API closely follows its Javascript counterpart while using pythonic constructs: promises are replaced by coroutines events are emitted using pyee.EventEmitter The main WebRTC and ORTC implementations are either built into web browsers, or come in the form of native code. In contrast, the aiortc implementation is fairly simple and readable. Good starting point for programmers wishing to understand how WebRTC works or tinker with its internals. Easy to create innovative products by leveraging the extensive modules available in the Python ecosystem. For instance you can build a full server handling both signaling and data channels or apply computer vision algorithms to video frames using OpenCV. Brian #5: Apprise - Push Notifications that work with just about every platform! listener suggestion cool shim project to allow multiple notification services in one app “Apprise allows you to send a notification to almost all of the most popular notification services available to us today such as: Telegram, Pushbullet, Slack, Twitter, etc. One notification library to rule them all. A common and intuitive notification syntax. 
Supports the handling of images (to the notification services that will accept them).” supports notification services such as discord, gitter, ifttt, mailgun, mattermost, MS teams, twitter, … SMS notification through Twilio, Nexmo, AWS, D7 email notifications Michael #6: Websauna web framework Websauna is a full stack Python web framework for building web services and back offices with admin interface and sign up process https://websauna.org "We have web applications 80% figured out. Websauna takes it up to 95%.” Built upon Python 3, Pyramid, and SQLAlchemy. When to use it? Websauna is focused on Internet facing sites where you have a public or private sign up process and an administrative interface. Its sweet spots include custom business portals and software-as-a-service products which are too specialized for off-the-shelf solutions. Benefits Focus on core business logic as Websauna provides basic website building blocks like sign up and sign in. Low learning curve and friendly comprehensive documentation help novice developers Emphasis is on meeting business requirements with reliable delivery times, responsiveness, consistency Site operations is half the story. Websauna provides an automated deployment process and integrates with monitoring, security and other DevOps solutions. Extras Michael: Data driven Flask course is out! Brian: Recent Test & Code episodes were solo because I’m in the middle of a work move and didn’t want to schedule interviews around a crazy work schedule. However, that should settle down in July and I can get back to getting great guests on the show. But I’m also having fun with solo topics, so I’ll keep that in the mix. upshot: if I’ve contacted you or you me about being on the show and you haven’t heard from me lately, give me a nudge with a DM or email or something. Jokes An SQL query goes into a bar, walks up to two tables and asks, 'Can I join you?' Not a joke, really, but along the lines of “comedy when it happens to you”. 
Reset procedure for GE lightbulbs theregister.co.uk/2019/06/20/ge_lightblulb_reset
July 2, 2019
Sponsored by Rollbar: https://pythonbytes.fm/rollbar Brian #1: Comparing the Same Project in Rust, Haskell, C++, Python, Scala and OCaml Tristan Hume, writing about a university project Teams of up to 3 people, multi month, write a Java to x86 compiler in language of choice Needed to pass both known and unknown tests. Secret tests to be run after submission encouraged teams to add more testing than provided. Nothing but standard libraries, and no parsing libraries, even if in standard. Lines of code Rust baseline Haskell: 1-1.6x C++: 1.4x Rust (another team): 3x Scala: 0.7 x OCaml: 1-1.6x Python: about half the size Python version one person used metaprogramming more extra features than any other team passed all public and secret tests Michael #2 : Pylustrator is a program to style your matplotlib plots via Len Wanger Pylustrator is a program to style your matplotlib plots for publication. Subplots can be resized and dragged around by the mouse, text and annotations can be added. Changes can be saved to the initial plot file as python code. Brian #3: MongoDB 4.2 Distributed Transactions extends multi-document ACID transactions across documents, collections, dbs in a replica set, and sharded cluster. Field Level Encryption encryption done on client side satisfies GDPR by allowing customer key destruction rendering server data on customer useless. system administration can be done with no exposure to private data Michael #4: Deep Difference and search of any Python object/data via François Leblanc DeepDiff: Deep Difference of dictionaries, iterables, strings and other objects. It will recursively look for all the changes. Lots of nice touches: List difference ignoring order or duplicates Report repetitions Exclude certain types from comparison Exclude part of your object tree from comparison Significant Digits DeepSearch: Search for objects within other objects. DeepHash: Hash of ANY python object based on its contents even if the object is not considered hashable! 
DeepHash is supposed to be deterministic in order to make sure 2 objects that contain the same data, produce the same hash. Brian #5: Advanced Python Testing Josh Peak “This article is mostly for me to process my thoughts but also to pave a path for anyone that wants to follow a similar journey on some more advanced python testing topics.” Learning journey (including some great podcasts and an awesome book on testing) Testing tools basic test structure adding black to testing with pytest-black linting with pylint including a very cool speed up trick to only lint modified files. flake8, including docstring checking tox.ini modifications code coverage goals and how to ratchet up to that goal with --cov-fail-under cool learning: “Increase code coverage by testing more code OR deleting code.” fixtures for database connections utilizing mocks, spies, stubs, and monkey patches, including pytest-mock pytest-vcr to save network interactions and replay them in future test runs, resulting in a 10x speedup. Lots of links and tangents possible from this article. Michael #6: Understanding Python's del via Kevin Buchs Official docs General confusion of what this does Looks like memory management, and it mostly isn’t Primary use: remove an item from a list given its index instead of its value or from a dictionary given its key: del person['profession'] # person is a dict del statement can also be used to remove slices from a list del lst[2:4] del can also be used to delete entire variables: del variable Recently covered how The CPython Bytecode Compiler is Dumb. Proactive dels could help. Extras Michael: Pynsource: Reverse engineer Python source code into UML diagrams (via Anders Klint) Language Bar chart race (via Josh Thurston) My Local maximum appearance. Jokes Optimist: The glass is half full. Pessimist: The glass is half empty. Programmer: The glass is twice as large as necessary. Pragmatist: allowing room for requirements oversights, scope creep, and schedule overrun. 
From “The Upside” with Kevin Hart and Bryan Cranston (watched it last night): K: Would you invest in [HTML_REMOVED]? B: That seems too niche. K: What’s “niche” mean? B: It’s the girl version of “nephew”.
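For the del forms listed under Michael #6 in this episode, here is a small sketch of the three uses mentioned (dict key, list index or slice, whole variable). The sample data and the name_is_gone flag are made up for illustration.

```python
# Remove a dictionary entry by key.
person = {"name": "Ada", "profession": "engineer"}
del person["profession"]

# Remove a list item by index, then a slice.
lst = [10, 20, 30, 40, 50]
del lst[0]      # lst is now [20, 30, 40, 50]
del lst[2:4]    # drops the items at positions 2 and 3

# Delete a whole variable: the name is unbound, so reading it raises NameError.
variable = "temporary"
del variable
try:
    variable
except NameError:
    name_is_gone = True
else:
    name_is_gone = False

print(person, lst, name_is_gone)
```

Note that del unbinds a name or removes an entry; whether any memory is released depends on whether other references to the object remain.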
June 25, 2019
Brought to you by Datadog: pythonbytes.fm/datadog Brian #1: Voilà! “from Jupyter notebooks to standalone applications and dashboards” Turn a notebook into a web app with: custom widgets runnable code (but not editable) interactive plots different custom grid layouts templates Michael #2: Toward a “Kernel Python” By Glyph Glyph wants to Marie Kondō the standard library (and I think I agree with him) We have PEP 594 for removing obviously obsolete and unmaintained detritus from the standard library. PEP 594 is great news for Python, and in particular for the maintainers of its standard library, who can now address a reduced surface area. Believes the PEP may be approaching the problem from the wrong direction. One “dead” battery is the colorsys module: why not remove it? “The module is useful to convert CSS colors between coordinate systems. Today, however, the modules you need to convert colors between coordinate systems are only a pip install away. Every little bit is overhead for the core devs, consider the state of PRs Looking at CPython’s keyword-based review queue, we can see that there are 429 tickets currently awaiting review. The oldest PR awaiting review hasn’t been touched since February 2, 2018, which is almost 500 days old. By Glyph’s subjective assessment, on this page of 25 PRs, 14 were about the standard library, 10 were about the core language or interpreter code We need a “kernel” version of Python that contains only the most absolutely minimal library, so that all implementations can agree on a core baseline that gives you a “python” Michael: There will be a cost to beginners. But there is already. Brian #3: Use __main__.py I didn’t know it was that easy to get python -m [HTML_REMOVED] to work. Michael #4: The CPython Bytecode Compiler is Dumb by Chris Wellons Given multiple ways to express the same algorithm or idea, Chris tends to prefer the one that compiles to the more efficient bytecode. 
Fortunately CPython, the main and most widely used implementation of Python, is very transparent about its bytecode. It’s easy to inspect and reason about its bytecode. The disassembly listing is easy to read and understand. One fact has become quite apparent: the CPython bytecode compiler is pretty dumb. With a few exceptions, it’s a very literal translation of a Python program, and there is almost no optimization. Darius Bacon points out that Guido van Rossum himself said, “Python is about having the simplest, dumbest compiler imaginable.” So this is all very much by design. The consensus seems to be that if you want or need better performance, use something other than Python. (And if you can’t do that, at least use PyPy.) ← Cython people, Cython. Example def foo(): x = 0 y = 1 return x Could easily be: def foo(): return 0 Yet, CPython completely misses this optimization for both x and y: 2 0 LOAD_CONST 1 (0) 2 STORE_FAST 0 (x) 3 4 LOAD_CONST 2 (1) 6 STORE_FAST 1 (y) 4 8 LOAD_FAST 0 (x) 10 RETURN_VALUE And so on. Brett Cannon has expressed performance as a major focus for CPython; maybe there is something here? Brian #5: You can play with EdgeDB now, maybe A Path to a 10x Database EdgeDB roadmap Alpha 1 is available. “EdgeDB is the next generation relational database based on PostgreSQL. It features a novel data model and an advanced query language.” I’m excited about what they’re doing. Looking forward to 1.0. Lots of great features listed in the 10x post, but what I’m most intrigued by is their replacement of SQL with a different query language. Michael #6: 16 Python libraries that helped a healthcare startup grow via Waqas Younas Worked with a U.S.-based healthcare startup for 7 years. This startup developed a software product that sent appointment reminders to the patients of healthcare facilities; the reminders were sent via email, text, and IVR. Paramiko - A Python implementation of SSHv2.
built-in CSV module SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper Requests - HTTP for Humans™ BeautifulSoup - Python library for pulling data out of HTML and XML files. testscenarios - a pyunit extension for dependency injection HL7 - a simple library for parsing messages of Health Level 7 (HL7) version 2.x into Python objects. Python-Phonenumbers - Library for parsing, formatting, and validating international phone numbers gevent - a coroutine -based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libev or libuv event loop. dateutil - powerful extensions to datetime (pip install python-dateutil) Matplotlib - a Python 2D plotting library which produces publication quality figures python-magic - a python interface to the libmagic file type identification library. libmagic identifies file types by checking their headers according to a predefined list of file types. Django - a high-level Python Web framework that encourages rapid development and clean, pragmatic design Boto - a Python package that provides interfaces to Amazon Web Services. Mailgun Python bindings - helped us send appointment reminders seamlessly Twilio’s Python bindings - helped us send appointment reminders seamlessly Extras Michael: United States Digital Service Jokes Difference between ML & AI? Ans.
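To reproduce the bytecode listing from the "CPython Bytecode Compiler is Dumb" item in this episode, a short sketch with the standard dis module. foo is the function from the notes; foo_folded is a name made up here for the hand-optimized version. The exact opcode names and offsets differ between CPython versions, so the output will probably not match the listing in the notes character for character.

```python
import dis

def foo():
    x = 0
    y = 1
    return x

def foo_folded():
    # Hand-written equivalent: what an optimizing compiler could emit.
    return 0

# Collect opcode names so the two functions can be compared programmatically.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
folded_opnames = [ins.opname for ins in dis.get_instructions(foo_folded)]

dis.dis(foo)          # human-readable listing, as in the notes
print(opnames)
print(folded_opnames)
```

On the CPython versions I'm aware of, the listing for foo still contains stores to both x and y, while foo_folded has none.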
June 20, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest Max Sklar Brian #1: Why do Python lists let you += a tuple, when you can’t + a tuple? Reuven Lerner >>> x = [1, 2, 3] >>> b = (4, 5, 6) >>> x + b Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: can only concatenate list (not "tuple") to list >>> x += b >>> x [1, 2, 3, 4, 5, 6] Huh?? “It turns out that the implementation of list.__iadd__ (in place add) takes the second (right-hand side) argument and adds it, one element at a time, to the list. It does this internally, so that you don’t need to execute any assignment after. The second argument to “+=” must be iterable.” Max #2: R vs Python, R is out of top 20 languages despite statistical boom Subtitle: is R declining because of Python? First of all, this article is about an index on the popularity of programming languages from an organization called TIOBE. Obviously it’s a combination of many different scores, and that could be controversial, but I’m going to assume that they put some thought into how the rankings are calculated, and that it’s as good as any. A few stories here: first, Python hit an all-time high in their ranking at number 3, beating out C++ I believe for the first time, and only Java and C are above it. The other story is that the statistical language R dipped below 20 to number 21, and the speculation is that Python has sort of taken over from R as the preferred statistical language. Personally, I got into Python much sooner, because I started as a software engineer, and moved into data science and machine learning. So after taking CS, and programming in Java and C for a few years, Python came much more naturally. But still - a lot of people who are data-science first (and they have additional skills compared to the kind of hybrid that I am) like and prefer R, and they can use it in a specialized way and get good results.
Personally, I’m going to stick with Python, because there are so many statistical libraries yet to learn, and it’s served me well thus far. The language I’ve used most in recent years, Scala, is surprisingly down at 31 - not even close! related: https://www.zdnet.com/article/programming-languages-python-predicted-to-overtake-c-and-java-in-next-4-years/ Michael #3: macOS deprecates Python 2, will stop shipping it (eventually) via Dan Bader, on the heels of WWDC 2019 “Future versions of macOS won’t include scripting language runtimes by default” Contrast this with Windows just now starting to ship with Python 3 In the same announcement: “Use of Python 2.7 isn’t recommended as this version is included in macOS for compatibility with legacy software. Future versions of macOS won’t include Python 2.7. Instead, it’s recommended that you run python3 from within Terminal. (51097165)” Also has impact wider than “us”. E.g. no Ruby or Perl means Homebrew doesn’t install easily, which is how we get Python 3! Brian #4: Pythonic Ways to Use Dictionaries Al Sweigart A few pythonic uses of dictionaries that are not obvious to new people. Use get() and setdefault() with Dictionaries get(key, default) allows you to read a key without checking for its existence beforehand. setdefault(key, default) is a bit of a strange duck but still useful. Set the value of something if it doesn’t exist yet. Python Uses Dictionaries Instead of a Switch Statement Just do it a few times to get the hang of it. Then it becomes natural. Michael's switch addition for Python: https://github.com/mikeckennedy/python-switch Max #5: Things you are probably not using in Python 3 But Should This is from Datawhatnow.com This is particularly relevant for me, since I used legacy Python at Foursquare for many years, and am now coming back to take another look at Python 3.
One that looks very useful is f-strings, where you can put the variable name in braces in a string and just have it replaced. I’ve seen things like this in other languages - notably PHP and most front-end scripts. Makes the code very readable. Except I know I’m going to screw up by leaving out that stray “f” in front of the string. It should almost be automatic, because how often are you putting these variable names in braces? Another thing I didn’t know Python 3 had - again, I’m kind of just getting started with Python 3 - is enumerations. I’ve been using Enums for years in Scala (really case classes) to make my code WAY more readable. Will keep that in mind when developing in Python 3. Michael #6: Have a time machine? C++ would get the Python 2 → 3 treatment too via James Small In a recent CppCast interview, Herb Sutter describes how he would change C/C++ types if he could go back in time. This is almost exactly how things were changed from Python 2 to Python 3 (str split into Unicode strings and byte arrays). So my question to you two is: Why was the transition so hard? Was it just habit and stubbornness? What could the PSF have done? Extras Michael: pip install mystery by Divo Kaplan A random Python package every time. Mystery is a Python package that is instantiated as a different package every time you install it! Inspired by one of our episodes Get our effective pycharm book bundle with the courses over at effectivepycharm.com Brian: Python 3.8.0b1 If you support a package, please test. Max: The Local Maximum Weekly Podcast that covers the theoretical issues in probability theory, philosophy, and machine learning, and then applies them in a practical way to things like current events and product development.
For example, a few weeks ago I did a show on how to estimate the probability of an event that has never occurred. We also cover things like Apple’s decision to break up iTunes, how the internet is shaping up in places like Cuba, and the controversy around YouTube’s recommendation algorithm. Jokes MK: There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-one errors.
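A quick sketch for the "Pythonic Ways to Use Dictionaries" item in this episode, covering get(), setdefault(), and a dictionary used in place of a switch statement. The data and the function names (start, stop, unknown, run) are invented for illustration and are not taken from the article.

```python
inventory = {"apples": 3}

# get(): read a key that may be missing, with a fallback instead of a KeyError.
pears = inventory.get("pears", 0)

# setdefault(): store a default only if the key is absent, then return the value.
groups = {}
for word in ["apple", "avocado", "banana"]:
    groups.setdefault(word[0], []).append(word)

# Dictionary dispatch instead of a switch statement.
def start():
    return "starting"

def stop():
    return "stopping"

def unknown():
    return "unknown command"

COMMANDS = {"start": start, "stop": stop}

def run(command):
    # Look up the handler, falling back to unknown() for anything unmapped.
    return COMMANDS.get(command, unknown)()

print(pears, groups, run("start"), run("dance"))
```

The dispatch table keeps each case in its own function, so adding a command is one new entry rather than another branch.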
June 12, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Three scientists publish a paper proving that Mercury, not Venus, is the closest planet to Earth. using Python contributed by, and explained by, listener Andrew Diederich. “This is from the March 19th, 2019 Strange Maps article. Which planet is, on average, closest to the Earth? Answer: Mercury. Actually, Mercury is, on average, the closest to all other planets, because it’s closest to the sun.” article, including video, uses PyEphem, which apparently is now deprecated and largely replaced with skyfield. Michael #2: Github semantics Parsing, analyzing, and comparing source code across many languages Written in Haskell, it’s a library and command line tool for parsing, analyzing, and comparing source code. It’s still early days yet, but semantic can do a lot of cool things, and is powering public-facing GitHub features. I’m tremendously excited to see how it’ll evolve now that it’s a community-facing project. Understands: Python, TypeScript, JavaScript, Ruby, Go, … Here are some cool things inside it: A flow-sensitive, caching, generalized interpreter for imperative languages An abstract interpreter that generates scope graphs for a given program text A strategic rewriting system based on recursion schemes for open syntax terms Brian #3: flake8-black Contributed by Nathan Clayton “The point of this plugin is to be able to run black --check ... from within the flake8 plugin ecosystem.” I like to run flake8 during development both to keep things neat, and to train myself to just write code in a more standard way. This is a way to run black with no surprises. Michael #4: Python Preview for VS Code You write Python code (script style mostly), it creates an object-visualization Think of a picture your first year C++ CS prof might draw. This extension does that automatically as you write Python code Looks to be based (conceptually) on Philip Guo’s Python Tutor site.
Brian #5: Create and Publish a Python Package with Poetry John Franey Walks through creating a package, customizing the pyproject.toml, and talks about the different settings in the toml and what they mean. Then it covers using TestPyPI and, finally, publishing. Michael #6: Pointers in Python: What's the Point? by Logan Jones Quick question: Does Python have pointers (outside of C-extensions, etc of course)? Yet Python is more pointer heavy than most languages (more so than C#, more so than even C++)! In Python, everything is an object, even numbers and booleans. Each object contains at least three pieces of data: Reference count Type Value Check that you have the same object with is instead of == Python variables are pointers, just safe ones. Interesting little tidbit from the article: Interning strings is useful to gain a little performance on dictionary lookup - if the keys in a dictionary are interned, and the lookup key is interned, the key comparisons (after hashing) can be done by a pointer compare instead of a string compare. (Source) But like we have inline-assembly in C++ and unsafe mode in C#, we can use pointers in Cython or more fine-grained with ctypes. Extras Michael: PSF needs your help. Spread the word about the fundraiser and please, ask your company to contribute: Building the PSF: the Q2 2019 Fundraiser (Donations are tax-deductible for individuals and organizations that pay taxes in the United States) “Contributions help fund workshops, conferences, pay meetup fees, support fiscal sponsorships, PyCon financial aid, and development sprints.” Jokes via Jay Miller What did the developer name his newborn boy? JSON
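For the "Pointers in Python" item in this episode, a small sketch of identity versus equality and of string interning. The variable names are illustrative only; sys.intern is the standard-library call for interning.

```python
import sys

a = [1, 2, 3]
b = a          # a second name bound to the same object
c = list(a)    # a new object with equal contents

same_object = a is b      # identity: both names point at one list
equal_value = a == c      # equality: contents match
distinct_copy = a is c    # the copy is a different object

b.append(4)               # mutating through b is visible through a

# Interning: equal strings built at runtime map to one shared object,
# so they can be compared by identity.
k1 = sys.intern("".join(["py", "thon"]))
k2 = sys.intern("".join(["py", "thon"]))
interned_same = k1 is k2

print(same_object, equal_value, distinct_copy, a, interned_same)
```

Use is when the question is "is this the very same object" (for example, is None), and == when the question is about value.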
June 5, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python built-ins worth learning Trey Hunner “I estimate most Python developers will only ever need about 30 built-in functions, but which 30 depends on what you’re actually doing with Python.” “I recommend triaging your knowledge: Things I should memorize such that I know them well Things I should know about so I can look them up more effectively later Things I shouldn’t bother with at all until/unless I need them one day” all 69 built-in functions, split into commonly known overlooked by beginners learn it later maybe learn it eventually you likely don’t need these Highlighting some: overlooked by beginners sum, enumerate, zip, bool, reversed, sorted, min, max, any, all know it’s there, but learn it later: open, input, repr, super, property, issubclass, isinstance, hasattr, getattr, setattr, delattr, classmethod, staticmethod, next my notes I think getattr should be learned early on, because it’s default behavior is so useful. But can’t use it for dicts. Use mydict.get(key, default) for dictionaries. Michael #2: Github sponsors and match Like Patreon but for GitHub projects 2x your sponsorship: Github matches! To boost community funding, we'll match contributions up to $5,000 during a developer’s first year in GitHub Sponsors with the GitHub Sponsors Matching Fund. 100% to developers, Zero fees: GitHub will not charge fees for GitHub Sponsors. Anyone who contributes to open source—whether through code, documentation, leadership, mentorship, design, or beyond—is eligible for sponsorship. Brian #3: Build a REST API in 30 minutes with Django REST Framework Bennett Garner Very fast intro including: Set up Django Create a model in the database that the Django ORM will manage Set up the Django REST Framework Serialize the model from step 2 Create the URI endpoints to view the serialized data Example is a simple hero db with hero name and alias. 
Michael #4: Dependabot has been acquired by GitHub Automated dependency updates: Dependabot creates pull requests to keep your dependencies secure and up-to-date. I personally use and recommend PyUP: https://pyup.io/ How it works: Dependabot checks for updates: Dependabot pulls down your dependency files and looks for any outdated or insecure requirements. Dependabot opens pull requests: If any of your dependencies are out-of-date, Dependabot opens individual pull requests to update each one. You review and merge: You check that your tests pass, scan the included changelog and release notes, then hit merge with confidence. Here's what you need to know: We're integrating Dependabot directly into GitHub, starting with security fix PRs 👮‍♂️ You can still install Dependabot from the GitHub Marketplace whilst we integrate it into GitHub, but it's now free of charge 🎁 We've doubled the size of Dependabot's team; expect lots of great improvements over the coming months 👩‍💻👨‍💻👩‍💻👨‍💻👩‍💻👨‍💻 Paid accounts are now free, automatically. Brian #5: spoof “New features planned for Python 4.0” Charles Leifer - also known for Peewee ORM This is funny, but painful. Is it too soon to joke about the pain of 2 to 3? A few of my favorites PEP8 will be updated. Line lengths will be increased to 89.5 characters. (compromise between 79 and 100) All new libraries and standard lib modules must include the phrase "for humans" somewhere in their title. Type-hinting has been extended to provide even fewer tangible benefits and will be called type whispering. You can make stuff go faster by adding async before every other keyword. Notable items left out of 4.0 Still no switch statement. No improvements to packaging. Michael #6: BlackSheep web framework Fast HTTP Server/Client microframework for Python asyncio, using Cython, uvloop, and httptools. Very Flask-like API. Interesting to consider the “popularity” of Flask vs Django in this context. 
Objectives Clean architecture and source code, following SOLID principles Intelligible and easy to learn API, similar to those of many Python web frameworks Keep the core package minimal and focused, as much as possible, on features defined in HTTP and HTML standards Targeting stateless applications to be deployed in the cloud High performance, see results from TechEmpower benchmarks (links in Wiki page) Also has an async client much like aiohttp. Extras Michael: Free courses in the Training mobile apps Upcoming webcast: 10 Tools and Techniques Python Web Developers Should Explore 2019 PSF Board Elections Get PyCharm, Support Python Until June 1st, get PyCharm at 30% OFF All the money raised will go toward the Python Software Foundation Jokes How do you generate a random string? Put a first year Computer Science student in Vim and ask them to save and exit. Waiter: He's choking! Is anyone a doctor? Programmer: I'm a Vim user.
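Circling back to the built-ins item (Brian #1) in this episode: a brief sketch of a few of the "overlooked by beginners" functions, plus the getattr default versus dict.get point from Brian's notes. The sample data and the config and settings objects are made up for illustration.

```python
from types import SimpleNamespace

scores = [70, 85, 90]
names = ["ann", "bob", "cy"]

total = sum(scores)
# zip pairs the two lists; sorted orders the pairs, highest score first.
ranked = sorted(zip(scores, names), reverse=True)
# enumerate supplies a counter alongside each item.
labelled = [f"{i}: {name}" for i, name in enumerate(names, start=1)]
all_pass = all(score >= 60 for score in scores)
any_perfect = any(score == 100 for score in scores)

# getattr with a default works on attributes of objects...
config = SimpleNamespace(debug=True)
timeout = getattr(config, "timeout", 30)

# ...while dictionaries use .get(key, default) for the same idea.
settings = {"debug": True}
dict_timeout = settings.get("timeout", 30)

print(total, ranked, labelled, all_pass, any_perfect, timeout, dict_timeout)
```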
May 30, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: History of CircuitPython PSF blog, A. Jesse Jiryu Davis Adafruit hired Scott Shawcroft to port MicroPython to their SAMD21 chip they use on many of their boards. CircuitPython is a friendly fork of MicroPython. Same licensing, and they share improvements back and forth. “MicroPython customizes its hardware APIs for each chip family to provide speed and flexibility for hardware experts. Adafruit’s audience, however, is first-time coders. Shawcroft said, “Our goal is to focus on the first five minutes someone has ever coded.” “ “Shawcroft aims to remove all roadblocks for beginners to be productive with CircuitPython. As he demonstrated, CircuitPython auto-reloads and runs code when the user saves it; there are two more user experience improvements in the latest release. First, serial output is shown on a connected display, so a program like print("hello world") will have visible output even before the coder learns how to control LEDs or other observable effects.” Related: CircuitPython 4.0.0 released Michael #2: R Risks Python Swallowing It Whole: TIOBE Is the R programming language in serious trouble? According to the latest update of the TIOBE Index, the answer seems to be “yes.” R has finally tumbled out of the top 20 languages “It seems that there is a consolidation going on in the statistical programming market. Python has become the big winner.” Briefly speculates why is Python (which ranked fourth on this month’s list) winning big in data science? My thought: Python is a full spectrum language with solid numerical support. Brian#3: The Missing Introduction To Containerization Aymen El Amri Understanding containerization through history chroot jail, 1979, allowed isolation of a root process and it’s children from the rest of the OS, but with no security restrictions. FreeBSD Jail, 2000, more secure, also isolating the file system. 
Linux VServer, 2001, added “security contexts” and used new OS system-level virtualization. Allows you to run multiple Linux distros on a single VPS. Oracle Solaris Containers, 2004, system resource controls and boundary separation provided by “zone”. OpenVZ, 2005, OS-level virtualization. Used by many hosting companies to isolate and sell VPSs. Google’s CGroups, 2007, a mechanism to limit and isolate resource usage. Was mainlined into the Linux kernel the same year. LXC, Linux Containers, 2008, similar to OpenVZ, but uses CGroups. CloudFoundry’s Warden, 2013, an API to manage environments. Docker, 2013, OS-level virtualization. Google’s LMCTFY (Let me contain that for you), 2014, an OSS version of Google’s container stack, providing Linux application containers. Most of this tech is being incorporated into libcontainer. “Everything at Google runs on containers. There are more than 2 billion containers running on Google infrastructure every week.” CoreOS’s rkt, 2014, an alternative to Docker. Lots of terms defined VPS, Virtual Machine, System VM, Process VM, … OS Containers vs App Containers Docker is both a Container and a Platform This is halfway through the article, and where I got lost in an example on creating a container sort of from scratch. I think I’ll skip to a Docker tutorial now, but really appreciate the back story and mental model of containers. Michael #4: Algorithms as objects We usually think of an algorithm as a single function with inputs and outputs. Our algorithms textbooks reinforce this notion. They present very concise descriptions that neatly fit in half of a page. Little details add up until you’re left with a gigantic, monolithic function. The monolithic function lacks readability. The function also lacks maintainability. Nobody wants to touch this code because it’s such a pain to get any context Complex code requires abstractions How to tell if your algorithm is an object Code smell #1. It’s too long or too deeply nested Code smell #2.
Banner comments Code smell #3. Helper functions as nested closures, but it’s still too long Code smell #4. There are actual helper functions, but they shouldn’t be called by anyone else Code smell #5. You’re passing state between your helper functions Write your algorithm as an object Refactoring a monolithic algorithm into a class improves readability, which is our #1 goal. Lots of concrete examples in the article Brian #5: pico-pytest Oliver Bestwalter Super tiny implementation of pytest core. 25 lines My original hand crafted test framework was way more code than that, and not as readable. This is good to look at to understand the heart of what test frameworks do: find test code, run it, mark any exceptions as failures. Of course, the bells and whistles added in the full implementation are super important, but this is the heart of what is happening. Michael #6: An Introduction to Cython, the Secret Python Extension with Superpowers Cython is one of the best kept secrets of Python. It extends Python in a direction that addresses many of the shortcomings of the language and the platform, such as execution speed, GIL-free concurrency, absence of type checking and not creating an executable. A number of widely used packages are written in it, such as spaCy, uvloop, and significant parts of scikit-learn, NumPy and pandas. Cython makes use of the architectural organization of Python by translating (or 'transpiling', as it is now called) a Python file into the C equivalent of what the Python runtime would be doing, and compiling this into machine code. Can sometimes avoid Python types altogether (e.g. sqrt function) C arrays versus lists: Python collection types (list, dict, tuple and set) can be used as a type in cdef functions. The problem with the list structure, however, is that it leads to Python runtime interaction, and is accordingly slow. Nice article for getting started and motivation.
But I didn’t see Python type annotations in play (they are now supported) Extras Brian: The Price of the Hallway Track - Hynek It’s lame to speak to an empty room, so go to some talks, and lean toward less known speakers. Definitely on my todo list for next year. Who put Python in the Windows 10 May 2019 Update? - Steve Dower more back story Michael: Little development board to production via Crowd Supply: The TinyPICO is an ESP32-based board that's, well, tiny ;) but packs a pretty significant punch...and it's been designed from day 1 to have first-class MicroPython support! via matt_trentini PyCon 2019 Reflections by Automation Panda Python Bytes (yeah, us!) has a Patreon page. Upcoming webcast: 10 Tools and Techniques Python Web Developers Should Explore Jokes What do you call eight hobbits? A hobbyte. Two bytes meet. The first byte asks, 'Are you ill?' The second byte replies, 'No, just feeling a bit off.’ OR: What is Benoit B. Mandelbrot's middle name? Benoit B. Mandelbrot.
May 21, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: PEP 581 (Using GitHub issues for CPython) is accepted PEP 581 The email announcing the acceptance. “The migration will be a large effort, with much planning, development, and testing, and we welcome volunteers who wish to help make it a reality. I look forward to your contributions on PEP 588 and the actual work of migrating issues to GitHub.” — Barry Warsaw Michael #2: Replace Nested Conditional with Guard Clauses Deeply nested code is problematic (does it have deodorant — err comments?) But what can you do? Guard clauses! See Martin Fowler’s article and this one. # BAD! def checkout(user): shipping, express = [], [] if user is not None: for item in user.cart: if item.is_available: shipping.append(item) if item.express_selected: express.append(item) return shipping, express # BETTER! def checkout(user): shipping, express = [], [] if user is None: return shipping, express for item in user.cart: if not item.is_available: continue shipping.append(item) if item.express_selected: express.append(item) return shipping, express Brian #3: Things you’re probably not using in Python 3 – but should Vinko Kodžoman Some of course items: f-strings Pathlib (side note. pytest tmp_path fixture creates temporary directories and files with PathLib) data classes Some I’m warming to: type hinting And those I’m really glad for the reminder of: enumerations from enum import Enum, auto class Monster(Enum): ZOMBIE = auto() WARRIOR = auto() BEAR = auto() print(Monster.ZOMBIE) # Monster.ZOMBIE built in lru_cache: easy memoization with the functools.lru_cache decorator. @lru_cache(maxsize=512) def fib_memoization(number: int) -> int: ... 
extended iterable unpacking >>> head, *body, tail = range(5) >>> print(head, body, tail) 0 [1, 2, 3] 4 >>> py, filename, *cmds = "python3.7 script.py -n 5 -l 15".split() >>> cmds ['-n', '5', '-l', '15'] >>> first, _, third, *_ = range(10) >>> first, third (0, 2) Michael #4: The Python Arcade Library Arcade is an easy-to-learn Python library for creating 2D video games. It is ideal for people learning to program, or developers that want to code a 2D game without learning a complex framework. Minesweeper games, hangman, platformer games in general. Check out Sample Games Made With The Arcade Library too Includes physics and other goodies Based on OpenGL Brian #5: Teaching a kid to code with Pygame Zero Matt Layman Scratch too far removed from coding. Using Mu to simplify coding interface. comes with a built in Python. Pygame Zero preinstalled “[Pygame Zero] is intended for use in education, so that teachers can teach basic programming without needing to explain the Pygame API or write an event loop.” Initial 29 line game taught: naming things and variables mutability and fiddling with “constants” to see the effect functions and side effects state and time interactions and mouse events Article also includes some tips on how to behave as the adult when working with kids and coding. Michael #6: Follow up on GIL / PEP 554 Has the Python GIL been slain? by Anthony Shaw multithreading in CPython is easy, but it’s not truly concurrent, and multiprocessing is concurrent but has a significant overhead. Because Interpreter state contains the memory allocation arena, a collection of all pointers to Python objects (local and global), sub-interpreters in PEP 554 cannot access the global variables of other interpreters. the way to share objects between interpreters would be to serialize them and use a form of IPC (network, disk or shared memory). 
All options are fairly inefficient But: PEP 574 proposes a new pickle protocol (v5) which has support for allowing memory buffers to be handled separately from the rest of the pickle stream. When? Pickle v5 and shared memory for multiprocessing will likely be Python 3.8 (October 2019) and sub-interpreters will be between 3.8 and 3.9. Extras Brian: PyCon 2019 videos are available So grateful for this. Already watched a couple, including Ant’s awesome talk about complexity and wily. pytest and hypothesis show up in the new Pragmatic Programmer book. Michael: 100 Days of Web course is out! Effective PyCharm book New release of our Android and iOS apps. Jokes MK → Waiter: Would you like coffee or tea? Programmer: Yes.
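The `lru_cache` snippet in Brian's item #3 leaves the function body out. A minimal sketch of one way it could be filled in, assuming a plain recursive Fibonacci (the body is an assumption, not taken from the article):

```python
from functools import lru_cache


@lru_cache(maxsize=512)
def fib_memoization(number: int) -> int:
    """Return the Fibonacci number at the given index (index 0 gives 0)."""
    if number < 2:
        return number
    # Each sub-result is computed once and then served from the cache.
    return fib_memoization(number - 1) + fib_memoization(number - 2)


if __name__ == "__main__":
    print(fib_memoization(30))
    # cache_info() reports hits, misses, maxsize and current size.
    print(fib_memoization.cache_info())
```

Without the decorator the same function recomputes the smaller values over and over; with it, each index is evaluated only once.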
May 14, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Folks this one is light on notes since we did it live. Enjoy the show! Special guests Emily Morehouse Steve Dower Topics Brian #1: pgcli Michael #2: Papermill Emily #3: Python Language Summit Steve #4: Python in Windows 10
May 6, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Maintaining a Python Project when it’s not your job Paul #2: Python in 1994 youtube.com/watch?v=7NrPCsH0mBU Barry #3 Python leadership in 2019 Michael #4: Textblob stackabuse.com
May 2, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Solving Algorithmic Problems in Python with pytest Adam Johnson How to utilize pytest to set up quick test cases for coding challenges, like Project Euler or Advent of Code. Moving the specification and examples in the challenge description into test cases. Running the tests with a stub implementation and understanding the failure output. Gradually building up a working solution. Nice demo of how little code it takes to write quick test cases. Also a cool idea to use challenge sites and platforms as TDD/test first practice, as well as practice converting specifications into test cases. Michael #2: DepHell -- project management for Python via @dreigelb Why it is better than all other tools: Format agnostic. You can use DepHell with your favorite format: setup.py, requirements.txt, Pipfile, poetry. DepHell supports them all and much more. Use your favorite tool on any project. Want to install a poetry based project, but don't like poetry? Just say DepHell to convert project meta information into setup.py and install it with pip. Or directly work with the project from DepHell, because DepHell can do everything what you usually want to do with packages. DepHell doesn't try to replace your favorite tools. If you use poetry, you have to use poetry's file formats and commands. However, DepHell can be combined with any other tool or even combine all these tools together through formats converting. You can use DepHell, poetry and pip at the same time. Easily extendable. Pipfile should be just another one supported format for pip. However, pip is really old and big project with many bad decisions, so, PyPA team can't just add new features in pip without fear to broke everything. This is how pipenv has been created, but pipenv has inherited almost all problems of pip and isn't extendable too. DepHell has strong modularity and can be easily extended by new formats and commands. Developers friendly. 
We aren't going to place all our modules into [_internal](https://github.com/pypa/pip/tree/master/src/pip/_internal). Also, DepHell has big ecosystem with separated libraries to help you use some DepHell's parts without pain and big dependencies for your project. All-in-one-solution. DepHell can manage dependencies, virtual environments, tests, CLI tools, packages, generate configs, show licenses for dependencies, make security audit, get downloads statistic from pypi, search packages and much more. None of your tools can do it all. Smart dependency resolution. Sometimes pip and pipenv can't lock your dependencies. Try to execute pipenv install oslo.utils==1.4.0. Pipenv can't handle it, but DepHell can: dephell deps add --from=Pipfile oslo.utils==1.4.0 to add new dependency and dephell deps convert --from=Pipfile --to=Pipfile.lock to lock it. Asyncio based. DepHell doesn't support Python 2.7, and that allows us to use modern features to make network and filesystem requests as fast as possible. Multiple environments. You can have as many environments for project as you want. Separate sphinx dependencies from your main and dev environment. Other tools like pipenv and poetry don't support it. Brian #3: Python rant: from foo import * is bad Mike Croucher I’m glad to see this post because I’m still seeing this practice a lot, even in tutorial blog posts! This is ambiguous: result = sqrt(-1) Is it: math.sqrt(-1)? or numpy.sqrt(-1) or cmath.sqrt(-1)? or scipy? or sympy? Recommendation: Never do from x import * Use import math or import numpy as np or even from scipy import sqrt Michael #4: Dask Dask natively scales Python Have numpy, pandas, and scikit-learn code that needs to go faster? Run these on smart clusters of servers Or just on your laptop Process more data than will fit into RAM Supported by… interesting to see proper support there.
Matthew Rocklin was on Talk Python 207 to discuss Brian #5: Animations with Matplotlib Parul Pandey The raindrop simulation is mesmerizing. Tutorial on using FuncAnimation to animate a sine wave although, I’m not sure what the x axis means during an animation Also: live updates based on changing data animate turning a 3D plot using celluloid package to animate simple example animating subplots changing legend during animation Michael #6: PEP 554 -- Multiple Interpreters in the Stdlib This proposal introduces the stdlib interpreters module. The module will be provisional. It exposes the basic functionality of subinterpreters already provided by the C-API, along with new (basic) functionality for sharing data between interpreters. Sharing data centers around "channels", which are similar to queues and pipes. Examples and use-cases: Running isolated code In process, true parallelism Versioning of modules (?) Plugin systems Extras Michael: iOS Talk Python Training app is out: training.talkpython.fm/apps Find us at PyCon! Blessings terminal API (from Erik Rose, via Prayson Daniel) Jokes via Topher Chung Knock knock. Race condition. Who's there?
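For the import rant in Brian's item #3, a small sketch of why the module prefix matters: two standard-library functions that share the name `sqrt` treat the same input very differently. Only `math` and `cmath` are shown here; the variable names are illustrative.

```python
import cmath
import math

# cmath works on complex numbers, so a negative input is fine.
complex_root = cmath.sqrt(-1)

# math works on real numbers only and rejects the same input.
try:
    real_root = math.sqrt(-1)
except ValueError as exc:
    real_root = None
    error_message = str(exc)

if __name__ == "__main__":
    print("cmath.sqrt(-1):", complex_root)
    print("math.sqrt(-1) failed with:", error_message)
```

With a star import, whichever `sqrt` was imported last silently wins, and the reader cannot tell which behavior to expect.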
April 25, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Kenneth Reitz Brian #1: inline_python (for rust) “I just made a Frankenstein's monster: Python code embedded directly in rustlang code. Should I kill it before it escapes the lab?” - Mara Bos Writing some rust, and need a little Python? Maybe want to pop open a matplotlib window? This may be just the thing you need. see also: https://pypi.org/project/bash/ Kenneth #2: Requests3: Under Way! Requests 2.x that you know and love is going into CVE-only mode (which it has been for a long time). Requests III is a new project which will bring async/await keywords to Requests. installable as requests3. Type-Annotations Python 3.6+ Michael #3: 🔥 Pyflame: A Ptracing Profiler For Python Pyflame is a high performance profiling tool that generates flame graphs for Python. Pyflame is implemented in C++, and uses the Linux ptrace(2) system call to collect profiling information. It can take snapshots of the Python call stack without explicit instrumentation Capable of profiling embedded Python interpreters like uWSGI. Fully supports profiling multi-threaded Python programs. Why use it? Pyflame usually introduces significantly less overhead than the builtin profile (or cProfile) modules, and emits richer profiling data. The profiling overhead is low enough that you can use it to profile live processes in production. Brian #4: flit + src Currently a WIP PR. flit is easy. Given a module or a source package. flit init creates pyproject.toml and LICENSE files. commit those to git flit build creates a wheel flit publish (builds and) publishes to whatever you have in your [.pypirc](https://docs.python.org/3/distutils/packageindex.html#the-pypirc-file) Changes in this PR The flit project already has 2 types of projects. just a module, like foo.py a package (directory with __init__.py), like foo/__init__.py This would add a 3rd and 4th. 
just a module, but in src, like src/foo.py a package in src, like src/foo/__init__.py May be cracking open a can of worms, but I’m ok with that. Kenneth #5: $ pipx install pipenv Michael #6: cheat.sh via Jon Bultmeyer Nothing to install, but works on the CLI $ http cht.sh/python/sort+list $ http cht.sh/python/connect+to+database Has a CLI client too with a proper shell Get started with http cht.sh/python/:learn Has a funky stealth mode too Editor integration VS Code & Vim cheat.sh uses selected community driven cheat sheet repositories and information sources, maintained by thousands of users, developers and authors all over the world Extras Brian: vi is good for beginners - fun read, for all you haters out there. But use vim, not vi. Better yet, IdeaVim for PyCharm or VSCodeVim for VS Code. nbstripout - command line tool to strip output from Jupyter Notebook files. We covered pyodide on episode 93, but here’s a cool article on it Pyodide: Bringing the scientific Python stack to the browser Michael: PyCon AU CFP LIGO Blackhole collision follow up: https://www.youtube.com/watch?v=BXID4teFfDc via Dave Kirby and Matthew Feickert https://github.com/kylebebak/questionnaire like Bullet but for windows too via Sander Teunissen Kenneth (optional): PyColorado CFP PyOhio CFP PyRemote! Jokes Don’t know if I’ll do all of these, but I like them. 🙂 Brian and Kenneth, feel free to add yours if you have some! MK: Ubuntu users are apt to get these jokes. MK: How many programmers does it take to kill a cockroach? Two: one holds, the other installs Windows on it. MK: A programmer had a problem. He thought to himself, 'I know, I'll solve it with threads!'. has Now problems. two he (mildly offensive) KR: What’s the difference between a musician and a pizza? A pizza can feed a family of four. (In collaboration with Jonatan Skogsfors) Python used to be directed by the BDFL, Guido. Now it’s directed by a steering council, GUIDs[0:4].
April 19, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Cecil Philip Brian #1: Python Used to Take Photo of Black Hole Lots of people talking about this. The link I’m including is a quick write up by Mike Driscoll. From now on these conversations can happen: “So, what can you do with Python?” “Well, it was used to help produce the world’s first image of a black hole. Your particular problem probably isn’t as complicated as that, so Python should work fine.” Projects listed in the paper: “First M87 Event Horizon Telescope Results. III. Data Processing and Calibration”: Numpy (van der Walt et al. 2011) Scipy (Jones et al. 2001) Pandas (McKinney 2010) Jupyter (Kluyver et al. 2016) Matplotlib (Hunter 2007). Astropy (The Astropy Collaboration et al. 2013, 2018) Cecil #2: Wasmer - Python Library for executing WebAssembly binaries WebAssembly (Wasm) enables high level languages to target a portable format that runs in the web Tons of languages compile down to Wasm but Wasmer enables the consumption of Wasm in Python This enables an interesting use case for using Wasm as a way to leverage code between languages Michael #3: Cooked Input cooked_input is a Python package for getting, cleaning, converting, and validating command line input. Name comes from input / raw_input (unvalidated) and cooked input (validated) Beginners can use the provided convenience classes to get simple inputs from the user. More complicated command line application (CLI) input can take advantage of cooked_input’s ability to create commands, menus and data tables.
All sorts of cool validates and cleaners Examples cap_cleaner = ci.CapitalizationCleaner(style=ci.ALL_WORDS_CAP_STYLE) ci.get_string(prompt="What is your name?", cleaners=[cap_cleaner]) >>> ci.get_int(prompt="How old are you?", minimum=1) How old are you?: abc "abc" cannot be converted to an integer number How old are you?: 0 "0" too low (min_val=1) How old are you?: 67 67 Brian #4: JetBrains and PyCharm officially collaborating with Anaconda PyCharm 2019.1.1 has some improvements for using Conda environments. Fixed various bugs related to creating Conda envs and installing packages into them. Special distribution of PyCharm: PyCharm for Anaconda with enhanced Anaconda support. I’m using PyCharm Pro with vim emulation this week to edit a notebook based presentation. I might run them in Jupyter, or just run it in PyCharm, but editing with all my normal keyboard shortcuts is awesome. Cecil #5: Building a Serverless IoT Solution with Python Azure Functions and SignalR Interesting blog post on using serverless, IoT, real-time messaging to create a live dashboard Shows how to create a serverless function in Python to process IoT data There’s tons of DIY applications for using this technique at home The Dashboard is a static website using D3 for charting. Michael #6: multiprocessing.shared_memory — Provides shared memory for direct access across processes New in Python 3.8 This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine. The ShareableList looks nice to use. Extras Brian: Getting ready for PyCon with STICKERS. Yeah, baby. Come see us at PyCon. I’ll also be bringing some copies of Python Testing with pytest, if anyone doesn’t already have a copy. Lots of interviews going on for Test & Code, and some will happen at PyCon. 
Cecil: Attendee Detector Workshop Talk Python training app on Android Michael: Guido van Rossum interviewed on MIT’s AI podcast via Tony Cappellini Visual Studio IntelliCode for VS & VS Code Showing a Craigslist scammer who's boss using Python via Dan Koster Jokes Brian: To understand recursion you must first understand recursion. Michael: A programmer was found dead in the shower. Next to their body was a bottle of shampoo with the instructions 'Lather, Rinse and Repeat'.
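As a follow-up to Michael's item #6, a minimal sketch of `ShareableList` from `multiprocessing.shared_memory` (Python 3.8+). To keep it short, the second handle is attached by name inside the same process; in real use that attach would normally happen in another process. The `demo` function name and the sample values are illustrative.

```python
from multiprocessing import shared_memory


def demo():
    # Create a fixed-size list backed by a shared memory block.
    owner = shared_memory.ShareableList([1, 2.5, "spam", None])
    try:
        # Attach a second handle to the same block using its name.
        other = shared_memory.ShareableList(name=owner.shm.name)
        other[0] = 42  # the write is visible through both handles
        snapshot = list(owner)
        other.shm.close()
    finally:
        # Release this handle, then free the block itself.
        owner.shm.close()
        owner.shm.unlink()
    return snapshot


if __name__ == "__main__":
    print(demo())
```

The explicit `close()` and `unlink()` calls matter: shared memory blocks outlive the objects that reference them unless they are unlinked.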
April 13, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: My How and Why: pyproject.toml & the 'src' Project Structure Brian Skinn pyproject.toml but with setuptools, instead of flit or poetry with a src dir and tox and black all the bits and pieces to make all of this work Michael #2: The Deadlock Empire: Slay dragons, master concurrency! A game to test your thread safety and skill! Deadlocks occur in code when two threads end up trying to enter two or more locks (RLocks please!) Consider lock_a and lock_b Thread one enters lock_a and will soon enter lock_b Thread two enters lock_b and will soon enter lock_a Imagine transferring money between two accounts, each with a lock, and each thread does this in opposite order. Brian #3: Cog 3.0 Ned Batchelder’s cog gets an update (last one was a few years ago). “Cog … finds snippets of Python in text files, executes them, and inserts the result back into the text. It’s good for adding a little bit of computational support into an otherwise static file.” Development moved from Bitbucket to GitHub. Travis and Appveyor CI. The biggest functional change is that errors during execution now get reasonable tracebacks that don’t require you to reverse-engineer how cog ran your code. mutmut mutation testing added. Cool. What I want to know more about is this statement: “…now I use it for making all my presentations”. Very cool idea. Michael #4: StackOverflow 2019 Developer Survey Results More good news for Python Lots of focus on gender in this one Contributing to Open Source About 65% of professional developers on Stack Overflow contribute to open source projects once a year or more. Involvement in open source varies with language. Developers who work with Rust, WebAssembly, and Elixir contribute to open source at the highest rates, while developers who work with VBA, C#, and SQL do so at about half those rates. 
Competence and Experience We see evidence here among the most junior developers for impostor syndrome, pervasive patterns of self-doubt, insecurity, and fear of being exposed as a fraud. Among our respondents, men grew more confident much more quickly than gender minorities. Programming, Scripting, and Markup Languages Python edges out Java, second only to JavaScript (and two non-programming languages) Databases MySQL, Postgres, Microsoft SQL Server, SQLite, MongoDB Most Loved, Dreaded, and Wanted Languages Loved: Rust, Python Wanted: Python, JavaScript Dreaded: VBA, ObjectiveC Most Loved, Dreaded, and Wanted Databases Loved: Postgres Wanted: MongoDB Most Popular Development Environments VS Code is crushing it How Technologies Are Connected is just interesting Brian #5: Cuv’ner “A commanding view of your test-coverage" Coverage visualizations on the console. Michael #6: Mobile apps launched The tech (sadly only 50% Python) Xamarin, Mono, and C# on the device-side Python, Pyramid, and MongoDB on the server-side 90% code sharing or higher Native applications Build the prototype myself on Windows Hired Giorgi via TopTal Get your own developer or get some freelancing work and support my app progress with my referral code: toptal.com/#we-annexed-perfect-engineers Dear mobile app developers: You have my sympathy! Try the app at training.talkpython.fm/apps Comes with 2 free courses for anyone who logs in. Android only at the moment but not for long Extras Brian: Python Bytes Patreon page is up: patreon.com/pythonbytes Michael: PyCon Booth XKCD Plots in Matplotlib with examples via Tim Harrison Fira Code Retina and Font Ligatures The EuroSciPy 2019 Conference will take place from September 2 to September 6 in Bilbao, Spain Jokes “When your hammer is C++, everything begins to look like a thumb.” “Why don't jokes work in octal? Because 7 10 11” Over explained: Why is 6 afraid of 7. Cuz 7 8 9. Follow on: Why did 7 eat 9? He was trying to eat 3^2 meals. 
I've been using Vim for a long time now, mainly because I can't figure out how to exit.
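The money-transfer deadlock described in Michael's item #2 (The Deadlock Empire) can be avoided by always taking the two locks in the same order. A minimal sketch of that idea, assuming a hypothetical `Account` class and using `id()` as the ordering key; this is one common remedy, not something taken from the game itself.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.RLock()


def transfer(source, target, amount):
    # Take both locks in one global order (here: by id()), whatever the
    # direction of the transfer, so two threads can never each hold one
    # lock while waiting for the other.
    first, second = sorted((source, target), key=id)
    with first.lock, second.lock:
        source.balance -= amount
        target.balance += amount


def run_demo(rounds=1000):
    a, b = Account(100), Account(100)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return a.balance, b.balance


if __name__ == "__main__":
    print(run_demo())
```

If `transfer` instead locked `source` first and `target` second, the two threads above would take the locks in opposite orders, which is exactly the scenario the item describes.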
April 5, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: pytest 4.4.0 Lots of amazing new features here (at least for testing nerds) testpaths displayed in output, if used. pytest.ini setting that allows you to specify a list of directories or tests (relative to test rootdir) to test. (can speed up test collection). Lots of goodies for plugin writers. Internal changes to allow subtests to work with a new plugin, pytest-subtests. Just started playing with it, but I’m excited already. Planning on a full Test & Code episode after I play with it a bit more. # unittest example: class T(unittest.TestCase): def test_foo(self): for i in range(5): with self.subTest("custom message", i=i): self.assertEqual(i % 2, 0) # pytest example: def test(subtests): for i in range(5): with subtests.test(msg="custom message", i=i): assert i % 2 == 0 Michael #2: requests-async async-await support for requests Just finished talking with Kenneth Reitz, native async coming to requests, but a while off Nice interim solution Requires modern Python (3.6) Interesting Flask, Quart, Starlette, etc. framework wrapper for testing Brian #3: Reasons why PyPI should not be a service Dustin Ingram’s article: PyPI as a Service “Layoffs at JavaScript package registry raise questions about fate of community resource” - The Register article Apparently PyPI gets requests for a private form of their service regularly, but there are problems with that. Currently a non-profit project under the PSF. That may be hard to maintain if they have a for-profit part. Donated services and infrastructure of more than $1M/year would be hard to replace. There are already other package repository options. Although there is probably room for others to compete. Currently run by volunteers for the most part. (
March 29, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Deconstructing xkcd.com/1987/ Brett Cannon Breakdown of the infamous xkcd comic poking fun at the author’s Python environment on his computer. The interpreters listed Homebrew description python.org binaries A discussion of pip, easy_install The paths and the $PATH and $PYTHONPATH Actually quite an educational history lesson, and the abuse some people put their computers through. “So the next time someone decides to link to this comic as proof that Python has a problem, you can say that it's actually Randall's problem.” Michael #2: Python package as a CLI option Wanted to make this little app available via a CLI as a dedicated command. Really tired of python3 script.py or ./script.py Turns out, pip and Python already solve this problem, if you structure your package correctly Thanks to everyone on Twitter! The trick turns out to be to have entrypoints in your package entry_points = { "console_scripts": ['bootstrap = bootstrap.bootstrap:main'] } ... This should even register it with pipx install package ;) Brian #3: pyright a Microsoft static type checker for the Python language. “Pyright was created to address gaps in existing Python type checkers like mypy.” 5x faster than mypy. Meant for large code bases. Written in TypeScript and runs within node. Michael #4: Refactoring Python Applications for Simplicity If you can write and maintain clean, simple Python code, then it’ll save you lots of time in the long term. You can spend less time testing, finding bugs, and making changes when your code is well laid out and simple to follow. Is your code complex? Metrics for Measuring Complexity Lines of Code Cyclomatic complexity is the measure of how many independent code paths there are through your application. Maintainability Index Refactoring: The technique of changing an application (either the code or the architecture) so that it behaves the same way on the outside, but internally has improved.
Nice overview of tooling (PyCharm, VS Code plugins, etc) Anti-patterns and ways out of them (best part of the article IMO) Brian #5: FastAPI Thanks Colin Sullivan for suggesting the topic “FastAPI framework, high performance, easy to learn, fast to code, ready for production” “Sales pitch / key features: Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available. Fast to code: Increase the speed to develop features by about 200% to 300%. (estimated) Fewer bugs: Reduce about 40% of human (developer) induced errors. (estimated) Intuitive: Great editor support. Completion everywhere. Less time debugging. Easy: Designed to be easy to use and learn. Less time reading docs. Short: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs. Robust: Get production-ready code. With automatic interactive documentation. Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI(previously known as Swagger) and JSON Schema.” uses: Starlette for the web parts. Pydantic for the data parts. document REST apis with both Swagger ReDoc looks like quite a fun contender in the “put together a REST API quickly” set of solutions out there. Just the front page demo is quite informative. There’s also a tutorial that seems like it might be a crash course in API best practices. Michael #6: Bleach: stepping down as maintainer by Will Kahn-Greene Bleach is a Python library for sanitizing and linkifying text from untrusted sources for safe usage in HTML. A retrospective on OSS project maintenance Picked up maintenance of the project because I was familiar with it current maintainer really wanted to step down Mozilla was using it on a bunch of sites I felt an obligation to make sure it didn't drop on the floor and I knew I could do it. 
Never really liked working on Bleach He did a bunch of work on a project I don't really use, but felt obligated to make sure it didn't fall on the floor, that has a pain-in-the-ass problem domain. Did that for 3+ years. Is [he] getting paid to work on it? Not really. Does [he] like working on it? No. Seems like [he] shouldn't be working on it anymore. Extras Brian sleepsort Michael: Passbolt Python 3.7.3 is now available stackroboflow via Alexander Allori Joke
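For Michael's item #2 in this episode (Python package as a CLI option), the `console_scripts` entry `bootstrap = bootstrap.bootstrap:main` points at a function named `main` in the module `bootstrap/bootstrap.py`. A minimal sketch of what such a function might look like; the `name` argument and the greeting are hypothetical and only there to give the command something to do.

```python
import argparse


def main(argv=None):
    """Entry point that a console_scripts entry can reference."""
    parser = argparse.ArgumentParser(prog="bootstrap")
    # Hypothetical argument, for illustration only.
    parser.add_argument("name", nargs="?", default="world")
    # With argv=None, argparse reads the real command line (sys.argv[1:]).
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}!")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Once the package is installed, the generated `bootstrap` command calls `main()` with no arguments, so accepting an optional `argv` keeps the function easy to call from tests as well.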
March 22, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Combining and separating dictionaries PEP 584 -- Add + and - operators to the built-in dict class. Steven D'Aprano Draft status, just created 1-March-2019 d1 + d2 would merge d2 into d1 like {**d1, **d2} or on two lines d = d1.copy() d.update(d2) of note, (d1 + d2) != (d2 + d1) Currently no subtraction equivalent Guido’s preference of + over | Related, Why operators are useful - also by Guido Michael #2: Why I Avoid Slack by Matthew Rocklin I avoid interacting on Slack, especially for technical conversations around open source software. Instead, I encourage colleagues to have technical and design conversations on GitHub, or some other system that is public, permanent, searchable, and cross-referenceable. Slack is fun but, internal real-time chat systems are, I think, bad for productivity generally, especially for public open source software maintenance. Prefer GitHub because I want to Engage collaborators that aren’t on our Slack Record the conversation in case participants change in the future. Serve the silent majority of users who search the web for answers to their questions or bugs. Encourage thoughtful discourse. Because GitHub is a permanent record it forces people to think more before they write. Cross reference issues. Slack is siloed. It doesn’t allow people to cross reference people or conversations across Slacks Brian #3: Hunting for Memory Leaks in Python applications Wai Chee Yau Conquering memory leaks and spikes in Python ML products at Zendesk. A quick tutorial of some useful memory tools The memory_profiler package and matplotlib to visualize memory spikes. Using muppy to heap dump at certain places in the code. objgraph to help memory profiling with object lineage. Some tips when memory leak/spike hunting: strive for quick feedback run memory intensive tasks in separate processes debugger can add references to objects watch out for packages that can be leaky pandas? really? 
Michael #4: Give Me Back My Monolith by Craig Kerstiens Feels like we’re starting to pass the peak of the hype cycle of microservices We’ve actually seen some migrations from micro-services back to a monolith. Here is a rundown of all the things that were simple that you now get to re-visit Setup went from intro chem to quantum mechanics Onboarding a new engineer, at least for an initial environment would be done in the first day. As we ventured into micro-services onboarding time skyrocketed So long for understanding our systems Back when we had monolithic apps if you had an error you had a clear stacktrace to see where it originated from and could jump right in and debug. Now we have a service that talks to another service, that queues something on a message bus, that another service processes, and then we have an error. If we can’t debug them, maybe we can test them All the trade-offs are for a good reason. Right? Brian #5: Famous Laws Of Software Development Tim Sommer 13 “laws” of software development, including Hofstadter’s Law: “It always takes longer than you expect, even when you take into account Hofstadter's Law.” Conway’s Law: “Any piece of software reflects the organizational structure that produced it.” The Peter Principle: “In a hierarchy, every employee tends to rise to his level of incompetence.” Ninety-ninety rule: “The first 90% of the code takes 10% of the time. The remaining 10% takes the other 90% of the time” Michael #6: Beer Garden Plugins A powerful plugin framework for converting your functions into composable, discoverable, production-ready services with minimal overhead. Beer Garden makes it easy to turn your functions into REST interfaces that are ready for production use, in a way that’s accessible to anyone that can write a function.
Based on MongoDB, RabbitMQ, & modern Python Nice docker-compose option too Extras Michael: Firefox Send Ethical ads on Python Bytes (and Talk Python) Brian: T&C 69: The Pragmatic Programmer — Andy Hunt not up yet, but will be before this episode is released Jokes From Derrick Chambers “What do you call it when a python programmer refuses to implement custom objects? self deprivation! Sorry, that joke was really classless.” via pyjokes: I had a problem so I thought I'd use Java. Now I have a ProblemFactory.
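Returning to the PEP 584 item from this episode: the notes describe the proposed merge in terms of two spellings that already work today. The sketch below uses only those spellings, since the `+` operator was a draft proposal at the time. The sample dictionaries and function names are made up for illustration.

```python
def merge(d1, d2):
    """Return a new dict with d2's items layered over d1's."""
    return {**d1, **d2}


def merge_two_lines(d1, d2):
    """Same result, written as the copy-then-update form from the notes."""
    d = d1.copy()
    d.update(d2)
    return d


# Hypothetical sample data: both dicts define "b" with different values.
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

if __name__ == "__main__":
    print(merge(d1, d2))  # d2 wins for the shared key
    print(merge(d2, d1))  # d1 wins for the shared key
```

The two calls differ only in which side wins for the shared key, which is why the merge is not commutative.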
March 16, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Futurize and Auto-Futurize Staged automatic conversion from Python2 to Python3 with futurize from python-future.org pip install future Stages: 1: safe fixes: exception syntax, print function, object base class, iterator syntax, key checking in dictionaries, and more 2: Python 3 style code with wrappers for Python 2 more risky items to change separating text from bytes, quite a few more very modular and you can be more aggressive or more conservative with flags. Do that, but between each step, run tests, and only continue if they pass, with auto-futurize from Timothy Hopper. a shell script that uses git to save staged changes and tox to test the code. Michael #2: Tech blog writing live stream via Anthony Shaw Live stream on "technical blog writing" Talking about how I put articles together, research, timing and other things about layouts and narratives. Covers “Modifying the Python language in 6 minutes”, deep article Listicles, “5 Easy Coding Projects to Do with Kids” A little insight into what is popular. Question article: Why is Python Slow? Tourists guide to the CPython source code Brian #3: Try out walrus operator in Python 3.8 Alexander Hultnér The walrus operator is the assignment expression that is coming in thanks to PEP 572.
# From: https://www.python.org/dev/peps/pep-0572/#syntax-and-semantics
# Handle a matched regex
if (match := pattern.search(data)) is not None:
    # Do something with match

# A loop that can't be trivially rewritten using 2-arg iter()
while chunk := file.read(8192):
    process(chunk)

# Reuse a value that's expensive to compute
[y := f(x), y**2, y**3]

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
This article walks through trying this out with the 3.8 alphas now available. Using pyenv and brew to install 3.8, but you can also just download it and try it out.
3.8.0a1: https://www.python.org/downloads/release/python-380a1/ 3.8.0a2: https://www.python.org/downloads/release/python-380a2/ Ends with a demonstration of the walrus operator working in a (I think) very likely use case, grabbing a value from a dict if the key exists for entry in sample_data: if title := entry.get("title"): print(f'Found title: "{title}"') That code won’t fail if the title key doesn’t exist. Michael #4: bullet : Beautiful Python Prompts Made Simple Have you ever wanted a dropdown select box for your CLI? Bullet! Lots of design options Also Password “boxes” Yes/No Numbers Looking for contributors, especially Windows support. Brian #5: Hosting private pip packages using Azure Artifacts Interesting idea to utilize artifacts as a private place to store built packages to pip install elsewhere. Walkthrough is assuming you are working with a data pipeline. You can package some of the work in earlier stages for use in later stages by packaging them and making them available as artifacts. Includes a basic tutorial on setuptools packaging and building an sdist and a wheel. Need to use CI in the Azure DevOps tool and use that to build the package and save the artifact Now in a later stage where you want to install the package, there are some configs needed to get the pip credentials right, included in the article. Very fun article/hack to beat Azure into a use model that maybe it wasn’t designed for. Could be useful for non data pipeline usage, I’m sure. Speaking of Azure, we brought up Anthony Shaw’s pytest-azurepipelines pytest plugin last week. Well, it is now part of the recommended Python template from Azure. Very cool. Michael #6: Async/await for wxPython via Andy Bulka Remember asyncio and PyQt from last week? Similar project called wxasync which does the same thing for wxPython! 
He’s written a Medium article about it https://medium.com/@abulka/async-await-for-wxpython-c78c667e0872 with links to that project, and shares some real-life usage scenarios and fun demo apps. wxPython is important because it's free, even for commercial purposes (unlike PyQt). His article even contains a slightly controversial section entitled "Is async/await an anti-pattern?" which refers to the phenomenon of the async keyword potentially spreading through one's codebase, and some thoughts on how to mitigate that. Extras Michael: Mongo license followup Will S. told me I was wrong! And I was. :) The main clarification I wanted to make above was that the AGPL has been around for a while, and it is the new SSPL from MongoDB that targets cloud providers. Also, one other point I didn't mention -- the reason the SSPL isn't considered open source is that it places additional conditions on providing the software as a service and the OSI's open source definition requires no discrimination based on field of endeavor. Michael: python2 becomes self-aware, enters fifth stage of grief Funny thread I started python2 -m pip list DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. Michael: PyDist — Simple Python Packaging Your private and public dependencies, all in one place. Looks to be paid, but with free beta? It mirrors the public PyPI index, and keeps packages and releases that have been deleted from PyPI. It allows organizations to upload their own private dependencies, and seamlessly create private forks of public packages. And it integrates with standard Python tools almost as well as PyPI does. Joke A metajoke: pip install --user pyjokes or even better pipx install pyjokes. Then: $ pyjoke [hilarity ensues! …]
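To make the walrus item from this episode concrete, here is a small self-contained sketch of the dict-lookup use case. It needs Python 3.8 or later. `sample_data` and `found_titles` are invented for illustration and are not taken from the article.

```python
# Hypothetical sample data; only the first entry has a usable title.
sample_data = [
    {"title": "Walrus operator"},
    {"author": "unknown"},  # no "title" key at all
    {"title": ""},  # key exists, but the value is falsy
]


def found_titles(entries):
    """Collect every truthy "title" value, skipping missing or empty ones."""
    titles = []
    for entry in entries:
        # Assign and test in one expression; .get() returns None for a
        # missing key, so no KeyError is raised.
        if title := entry.get("title"):
            titles.append(title)
    return titles


if __name__ == "__main__":
    for title in found_titles(sample_data):
        print(f'Found title: "{title}"')
```

Note that the truthiness test also skips empty strings. If an empty title should count, compare against None explicitly, as the regex example in PEP 572 does.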
March 5, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: The Ultimate Guide To Memorable Tech Talks Nina Zakharenko 7 part series that covers choosing a topic, writing a talk proposal, tools, planning, writing, practicing, and delivering the talk I’ve just read the tools section, and am looking forward to the rest of the series. From the tools section: “I noticed I’d procrastinate on making the slides look good instead of focusing my time on making quality content.” Michael #2: Running Flask on Kubernetes via TestDriven.io & Michael Herman What is Kubernetes? A step-by-step tutorial that details how to deploy a Flask-based microservice (along with Postgres and Vue.js) to a Kubernetes cluster. Goals of tutorial Explain what container orchestration is and why you may need to use an orchestration tool Discuss the pros and cons of using Kubernetes over other orchestration tools like Docker Swarm and Elastic Container Service (ECS) Explain the following Kubernetes primitives - Node, Pod, Service, Label, Deployment, Ingress, and Volume Spin up a Python-based microservice locally with Docker Compose Configure a Kubernetes cluster to run locally with Minikube Set up a volume to hold Postgres data within a Kubernetes cluster Use Kubernetes Secrets to manage sensitive information Run Flask, Gunicorn, Postgres, and Vue on Kubernetes Expose Flask and Vue to external users via an Ingress Brian #3: Changes in the CI landscape Travis CI joins the Idera family - TravisCI blog #travisAlums on Twitter “TravisCI is laying off a bunch of senior engineers and other technical staff. 
Look at the #travisAlums hashtag and hire them!” - alicegoldfuss options: GitHub lists 17 options for CI, including GitLab & Azure Pipelines Some relevant articles, resources: The CI/CD market consolidation - GitLab article Azure Pipelines with Python — by example - Anthony Shaw pytest-azurepipelines - Anthony Shaw Azure Pipelines Templates - Anthony Sottile Michael #4: Python server setup for macOS 🍎 what: hello world for Python server setup on macOS why: most guides show setup on a Linux server (which makes sense) but macOS is useful for learning and for local dev STEP 1: NGINX ➡️ STATIC ASSETS STEP 2: GUNICORN ➡️ FLASK STEP 3: NGINX ➡️ GUNICORN Brian #5: Learn Enough Python to be Useful: argparse How to Get Command Line Arguments Into Your Scripts - Jeff Hale argparse is the “recommended command-line parsing module in the Python standard library.” It’s what you use to get command line arguments into your program. “I couldn’t find a good intro guide for argparse when I needed one, so I wrote this article.” Michael #6: AWS, MongoDB, and the Economic Realities of Open Source Related podcast: https://soundcloud.com/exponentfm/episode-159-inverted-pyramids Last week, from the AWS blog: Today we are launching Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that is designed to be compatible with your existing MongoDB applications and tools. Amazon DocumentDB uses a purpose-built SSD-based storage layer, with 6x replication across 3 separate Availability Zones. The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads. Like an increasing number of such projects, MongoDB is open source…or it was anyways. MongoDB Inc., a venture-backed company that IPO’d in October, 2017, made its core database server product available under the GNU Affero General Public License (AGPL).
AGPL extended the GPL to apply to software accessed over a network; since the software is only being used, not copied, the plain GPL would not be triggered. MongoDB’s Business Model We believe we have a highly differentiated business model that combines the developer mindshare and adoption benefits of open source with the economic benefits of a proprietary software subscription business model. MongoDB Enterprise and MongoDB Atlas Basically, MongoDB sells three things on top of its open source database server: Additional tools for enterprise companies to implement MongoDB A hosted service for smaller companies to use MongoDB Legal certainty What AWS Sells The value of software is typically realized in three ways: First is hardware. Second is licenses. This was Microsoft’s core business for decades: licenses sold to OEMs (for the consumer market) or to companies directly (for the enterprise market). Third is software-as-a-service. AWS announced last week: “The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads.” AWS is not selling MongoDB: what they are selling is “performance, scalability, and availability.” DocumentDB is just one particular area of many where those benefits are manifested on AWS. Thus we have arrived at a conundrum for open source companies: MongoDB leveraged open source to gain mindshare. MongoDB Inc. built a successful company selling additional tools for enterprises to run MongoDB. More and more enterprises don’t want to run their own software: they want to hire AWS (or Microsoft or Google) to run it for them, because they value performance, scalability, and availability. This leaves MongoDB Inc. not unlike the record companies after the advent of downloads: what they sold was not software but rather the tools that made that software usable, but those tools are increasingly obsolete as computing moves to the cloud.
And now AWS is selling what enterprises really want. This tradeoff is inescapable, and it is fair to wonder if the golden age of VC-funded open source companies will start to fade (although not open source generally). The monetization model depends on the friction of on-premise software; once cloud computing is dominant, the economic model is much more challenging. Extras: PyTexas 2019 at #Austin on Apr 13th and 14th. Registrations now open. More info at pytexas.org/2019/ Michael: Sorry Ant! Michael: RustPython follow up: https://rustpython.github.io/demo/ Joke: Q: Why was the developer unhappy at their job? A: They wanted arrays. Q: Where did the parallel function wash its hands? A: Async
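For the argparse item in this episode, a minimal sketch of getting command line arguments into a script might look like the following. The `name` and `--times` arguments and the greeting are hypothetical, chosen only to show one positional and one optional argument.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Greet someone from the command line."
    )
    # Positional argument: required, identified by position.
    parser.add_argument("name", help="who to greet")
    # Optional argument: converted to int, with a default when omitted.
    parser.add_argument(
        "--times", type=int, default=1, help="how many times to greet"
    )
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:]; passing a list eases testing.
    args = build_parser().parse_args(argv)
    return ["Hello, " + args.name + "!"] * args.times


if __name__ == "__main__":
    for line in main():
        print(line)
```

Saved as, say, greet.py, it could be run as `python greet.py Ada --times 2`. Accepting `argv` as a parameter is a design choice that lets the parsing logic be exercised without touching the real command line.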
February 26, 2019
Sponsored by pythonbytes.fm/datadog Special guests Eric Chou Dan Bader Trey Hunner Michael #1: Incrementally migrating over one million lines of code from Python 2 to Python 3 Weighing in at over 1 million lines of Python logic, we had a massive surface area for potential issues in our migration from Python 2 to Python 3 First Py3 commit, hack week 2015 Unfortunately, it was clear that many features were completely broken by the upgrade Official start H1 2017 Armed with Mypy, a static type-checking tool that we had adopted in the interim year, they made substantial strides towards enabling the Python 3 migration: Ported our custom fork of Python to version 3.5 Upgraded some Python dependencies to Python 3-compatible versions, and forked some others (e.g. babel) Modified some Dropbox client code to be Python 3 compatible Set up automated jobs in our continuous integration (CI) to run the existing unit tests with the Python 3 interpreter, and Mypy type-checking in Python 3 mode Crucially, the automated tests meant that we could be certain that the limited Python 3 compatibility that existed would not have regressed when the project was picked up again. Prerequisites Before we could begin working on migrating any of our application logic, we had to ensure that we could load the Python 3 interpreter and run until the entry point of the application. In the past, we had used “freezer” scripts to do this for us. However, none of these had support for Python 3 around this time, so in late 2016, we built a custom, more native solution which we internally referred to as “Anti-freeze” (more on that in the initial Python 3 migration blog post). Incrementally enabling unit tests and type-checking ‘Straddling’ Python 2 and Python 3 Letting it bake Learnings (tl;dr) Unit tests and typing are invaluable. String encoding in Python is hard. Incrementally migrate to Python 3 for great profit. 
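The "String encoding in Python is hard" learning above is easier to see with a small example. The sketch below shows one common way to keep text and bytes apart in Python 3: decode at the boundary, work with str internally, encode on the way out. The helper names are hypothetical and this is not taken from the Dropbox post.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes arriving from disk or network; pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Encode text that is about to leave the program; pass bytes through."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


if __name__ == "__main__":
    raw = b"caf\xc3\xa9"  # UTF-8 bytes for "café"
    text = to_text(raw)
    print(text, len(raw), len(text))  # byte length and character length differ
    # In Python 3, bytes and str never compare equal, even for ASCII content.
    print(b"abc" == "abc")
```

In Python 2 the last comparison is true because `str` is a byte string there, which is one reason code that straddles both versions tends to surface encoding bugs.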
Eric #2: Network Automation Development with Python (for fun and for profit) Terms: NetDevOps (Cisco), NRE (Network Reliability Engineer) Libraries: Netmiko, NAPALM, Nornir Free Lab Resources: NRE Labs, dCloud, DevNet Conferences: AnsibleFest (network automation track), Cisco DevnetCreate Trey #3: Alkali file as DB If you have structured data you want to query (like RSS feed, CSV, JSON, or any custom format of your own creation) you can use a Django ORM-like syntax to query it Save it to the same format or a different format because you control both the reading and the writing Kurt is at PyCascades so I got to chat with him about this Dan #4: Carnegie Mellon Launches Undergraduate Degree in Artificial Intelligence Carnegie Mellon University's School of Computer Science will offer a new undergraduate degree in artificial intelligence beginning this fall The first offered by a U.S. university "Specialists in artificial intelligence have never been more important, in shorter supply or in greater demand by employers," said Andrew Moore, dean of the School of Computer Science. The bachelor's degree in AI will focus more on how complex inputs — such as vision, language and huge databases — are used to make decisions or enhance human capabilities Michael #5: asyncio + PyQt5/PySide2 via Florian Dahlitz asyncqt is an implementation of the PEP 3156 event-loop with Qt. This package is a fork of quamash focusing on modern Python versions, with some extra utilities, examples and simplified CI. Allows wiring events to Qt’s event loop that run on asyncio and leverage it internally. Example: https://github.com/gmarull/asyncqt/blob/master/examples/aiohttp_fetch.py Dan #6: 4 things I want to see in Python 4.0 JIT as a first class feature A stable .0 release Static type hinting A GPU story for multiprocessing More community contributions Extras: Michael: My Python Async webcast recording is now available.
Michael: PyCon Israel in the first week of June (https://il.pycon.org/2019/), and the CFP opened today: https://cfp.pycon.org.il/conference/cfp Dan: Python Basics Book Joke: Q: Why did the developer ground their kid? A: They weren't telling the truthy
February 22, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: Frozen-Flask “Frozen-Flask freezes a Flask application into a set of static files. The result can be hosted without any server-side software other than a traditional web server.” 2012 tutorial, Dead easy yet powerful static website generator with Flask Some of it is out of date, but it does point to the power of Frozen-Flask, as well as highlight a cool plugin, Flask-FlatPages, which allows pages from markdown. Michael #2: pipx by Chad Smith Last week we spoke about pythonloc Execute binaries from Python packages in isolated environments "binary" to describe a CLI application that can be run directly from the command line Features Safely install packages to isolated virtual environments, while globally exposing their CLI applications so you can run them from anywhere Easily list, upgrade, and uninstall packages that were installed with pipx Run the latest version of a CLI application from a package in a temporary virtual environment, leaving your system untouched after it finishes Run binaries from the __pypackages__ directory per PEP 582 as companion tool to pythonloc Runs with regular user permissions, never calling sudo pip install ... (you aren't doing that, are you? 😄). You can globally install a CLI application by running: pipx install PACKAGE "Just the “pipx upgrade-all” command is already a huge win over pipsi" Check out How does this compare to pipsi? Brian #3: Data science is different now Vicki Boykis There’s lots of buzz around data science. This has resulted in loads of new data scientists looking for junior level positions. Coming from boot camps, MOOCs, self taught, remote degrees, and other training. “.. 
now that data science has changed from a buzzword to something even larger companies outside of the Silicon Valley bubble hire for, positions have not only become more codified, but with more rigorous entry requirements that will prefer people with previous data science experience every time.” “ … the market can be very hard, and very discouraging for the flood of beginners.” Data science is a misleading job req “The reality is that “data science” has never been as much about machine learning as it has about cleaning, shaping data, and moving it from place to place.” Advice: Don’t get into data science (this amuses me). “Don’t do what everyone else is doing, because it won’t differentiate you.” “It’s much easier to come into a data science and tech career through the “back door”, i.e. starting out as a junior developer, or in DevOps, project management, and, perhaps most relevant, as a data analyst, information manager, or similar, than it is to apply point-blank for the same 5 positions that everyone else is applying to. It will take longer, but at the same time as you’re working towards that data science job, you’re learning critical IT skills that will be important to you your entire career.” Learn the skills needed for data science today Creating Python packages Putting R in production Optimizing Spark jobs so they run more efficiently Version controlling data Making models and data reproducible Version controlling SQL Building and maintaining clean data in data lakes Tooling for time series forecasting at scale Scaling sharing of Jupyter notebooks Thinking about systems for clean data Lots of JSON Data science is turning more and more into a mostly engineering field. Data scientists need to have “good generalist engineering skills with a data background.” Michael #4: RustPython via Fredrik Averpil A Python-3 (CPython >= 3.5.0) Interpreter written in Rust. 
Seems pretty active: Latest commit ac95b61 an hour ago… Goals Full Python-3 environment entirely in Rust (not CPython bindings) A clean implementation without compatibility hacks Contributing To start contributing, there are a lot of things that need to be done. Most tasks are listed in the issue tracker. Check issues labeled with good first issue if you wish to start coding. Rust does have direct WebAssembly support… Brian #5: Jupyter Notebook: An Introduction Mike Driscoll on RealPython Not the “all the cool things you can do with it”, but the “really, how do I start” tutorial. I think it should have included a mention of installing it in a venv and how to use %pip install, so I’ll include those things in these notes. Installing with pip install jupyter. Also a note that Jupyter is included with the Anaconda distribution. Note: Like everything else, I always install it in a virtual environment, if using pip, so the real installation instructions I recommend are:
python3 -m venv venv --prompt jupyter
source venv/bin/activate OR venv\scripts\activate.bat if windows
pip install jupyter
pip install [HTML_REMOVED]
jupyter notebook
That will launch a localhost web interface. Creating a new notebook within the web interface. Changing the “Untitled” name by clicking on the name. This was not obvious to me. Running cells, including the shift-enter keyboard shortcut. A run through the menu, stopping at non-obvious places “File” has “Save and Checkpoint” which is super cool. “Edit” has cell cut, copy, paste. But also has delete, split, merge, and cell movement. “Cell” menu has lots of cool run options, like “Run all above” and “Run all below” and others. Not just Python, but you can have terminal sessions and more from within Jupyter. A look at the “Running” tab.
Quick overview of the markdown support for markdown cells Exporting notebooks using jupyter nbconvert Extra notes on installing packages from Jupyter: To pip install from the notebook, do this: %pip install numpy in a code cell. Michael #6: Python Developers Survey 2018 Results Python usage as a main language is up 5 percentage points from 79% in 2017 when Python Software Foundation conducted its previous survey. What do you use Python for? (2018/2017) 59%/51% Data analysis 56%/54% Web dev 39%/32% ML Web development is the only category with a large gap (56% vs. 36%) separating those using Python as their main language vs. as a supplementary language. For other types of development, the differences are much smaller. What do you use Python for the most? (single answer) 29%/29% web dev 17%/17% data analysis 11%/8% ML Like last year: 27% (Web development) ≈ 28% (Scientific development) Science = 17% + 11% for Data analysis + Machine learning Python 3 vs Python 2 84% Python 3 vs 16% Python 2. The use of Python 3 continues to grow rapidly. According to the latest research in 2017, 75% were using Python 3 compared with 25% for Python 2. Top 4 web frameworks (majority to the first two): Flask Django Tornado Pyramid Databases PostgreSQL MySQL SQLite MongoDB ORMs SQLAlchemy and Django ORM tied Extras: “Mentored sprints for diverse beginners” at PyCon “A newcomer’s introduction to contributing to an open source project” https://us.pycon.org/2019/hatchery/mentoredsprints/ Call for applications for projects open Feb 8 to March 14 Call for contributors, participants in the sprint also open Feb 8 to March 14 “If you are wondering if this event is for you: it definitely is and we would love to have you taking part in this sprint.” “This mentored sprint will take place on Saturday, May 4th, 2019 from 2:35pm to 6:30pm” Joke: via Florian Q: If you have some pseudo code (say in sample.txt) how do you most easily convert it to Python? 
A: Change the extension to .py Extra Joke: Python Song (with chapters!)
February 14, 2019
Sponsored by pythonbytes.fm/datadog Brian #1: Goodbye Virtual Environments? by Chad Smith venvs are great but they introduce some problems as well: Learning curve: explaining “virtual environments” to people who just want to jump in and code is not always easy Terminal isolation: Virtual Environments are activated and deactivated on a per-terminal basis Cognitive overhead: Setting up, remembering installation location, activating/deactivating PEP 582 — Python local packages directory This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. This will avoid the steps to create, activate or deactivate “virtual environments”. Python will use the __pypackages__ from the base directory of the script when present. Try it now with pythonloc pythonloc is a drop in replacement for python and pip that automatically recognizes a __pypackages__ directory and prefers importing packages installed in this location over user or global site-packages. If you are familiar with node, __pypackages__ works similarly to node_modules. Instead of running python you run pythonloc and the __pypackages__ path will automatically be searched first for packages. And instead of running pip you run piploc and it will install/uninstall from __pypackages__. Michael #2: webassets Bundles and minifies CSS & JS files Been doing a lot of work to rank higher on the sites That led me to Google’s Lighthouse Despite 25ms response time to the network, Google thought my site was “kinda slow”, yikes! webassets has integration for the big three: Django, Flask, & Pyramid.
But I prefer to just generate them and serve them off disk
from typing import List
import webassets

def build_asset(env: webassets.Environment, files: List[str], filters: str, output: str):
    bundle = webassets.Bundle(
        *files,
        filters=filters,
        output=output,
        env=env
    )
    bundle.build(force=True)
Brian #3: Bernat on Python Packaging 3 part series by Bernat Gabor Maintainer of tox and virtualenv Python packages. The State of Python Packaging Python packaging - Past, Present, Future Python packaging - Growing Pains Michael #4: What the mock? — A cheatsheet for mocking in Python Nice introduction Some examples
@mock.patch('work.os')
def test_using_decorator(self, mocked_os):
    work_on()
    mocked_os.getcwd.assert_called_once()
And
def test_using_context_manager(self):
    with mock.patch('work.os') as mocked_os:
        work_on()
        mocked_os.getcwd.assert_called_once()
Brian #5: Transitions: The easiest way to improve your tech talk By Saron Yitbarek Jeff Atwood of CodingHorror noted “The people who can write and communicate effectively are, all too often, the only people who get heard. They get to set the terms of the debate.” Effectively presenting is part of effective communication. I love the focus of this article. Focused on one little aspect of improving the performance of a tech talk. Michael #6: Steering council announced Our new leaders are Barry Warsaw Brett Cannon Carol Willing Guido van Rossum Nick Coghlan Via Joe Carey We both think it’s great Guido is on the council.
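The mocking cheatsheet examples in this episode patch `work.os`, which assumes a separate `work` module. For readers who want something runnable in a single file, here is a sketch under that changed assumption: `work_on` is a hypothetical function defined inline, and the patch targets `os.getcwd` directly rather than a module attribute.

```python
import os
from unittest import mock


def work_on():
    # Hypothetical function under test: it only reports the working directory.
    return "working in " + os.getcwd()


def check_with_context_manager():
    """Patch os.getcwd for the duration of the with-block only."""
    with mock.patch("os.getcwd", return_value="/tmp/project") as mocked_getcwd:
        result = work_on()
    # The mock records its calls, so the assertion still works after the block.
    mocked_getcwd.assert_called_once()
    return result


if __name__ == "__main__":
    print(check_with_context_manager())
    print(work_on())  # the real os.getcwd is restored here
```

In a real project the usual advice is to patch the name where it is looked up, as the cheatsheet does with `work.os`. Patching `os.getcwd` works here only because `work_on` resolves `getcwd` through the `os` module at call time.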
February 6, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: Inside python dict — an explorable explanation Interactive tutorial on dictionaries Searching efficiently in a list Why are hash tables called hash tables? Putting it all together to make an “almost”-Python-dict How Python dict really works internally Yes this is a super deep dive, but wow it’s cool. Tons of the code is runnable right there in the web page, including moving visual representations, highlighted code with current line of code highlighted. Some examples allow you to edit values and play with stuff. Michael #2: Embed Python in Unreal Engine 4 You may notice a theme throughout my set of picks on this episode Games built on Unreal Engine 4 include Fortnite: Save the World Gears of War 4 Marvel vs. Capcom: Infinite Moto Racer 4 System Shock (remake) Plugin embedding a whole Python VM in Unreal Engine 4 (both the editor and runtime). This means you can use the plugin to write other plugins, to automate tasks, to write unit tests and to implement gameplay elements. Here is an example usage. It’s a really nice overview and tutorial for the editor. For game elements, check out this section. Brian #3: Redirecting stdout with contextlib When I want to test the stdout output of some code, that’s easy, I grab the capsys fixture from pytest. But what if you want to grab the stdout of a method NOT while testing? Enter [contextlib.redirect_stdout(new_target)](https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout) so cool. And very easy to read. ex:
f = io.StringIO()
with redirect_stdout(f):
    help(pow)
s = f.getvalue()
also a version for stderr Michael #4: Panda3D via Kolja Lubitz Panda3D is an open-source, completely free-to-use engine for realtime 3D games, visualizations, simulations, experiments Not just games, could be science as well! The full power of the graphics card is exposed through an easy-to-use API.
Panda3D combines the speed of C++ with the ease of use of Python to give you a fast rate of development without sacrificing on performance. Features: Platform Portability Flexible Asset Handling: Panda3D includes command-line tools for processing and optimizing source assets, allowing you to automate and script your content production pipeline to fit your exact needs. Library Bindings: Panda3D comes with out-of-the-box support for many popular third-party libraries, such as the Bullet physics engine, Assimp model loader, OpenAL Performance Profiling: Panda3D includes pstats — an over-the-network profiling system designed to help you understand where every single millisecond of your frame time goes. Brian #5: Why PyPI Doesn't Know Your Projects Dependencies Some questions you may have asked: > How can I produce a dependency graph for Python packages? > Why doesn’t PyPI show a project’s dependencies on its project page? > How can I get a project’s dependencies without downloading the package? > Can I search PyPI and filter out projects that have a certain dependency? If everything is in requirements.txt, you just might be able to, but… setup.py is dynamic. You gotta run it to see what’s needed. Dependencies might be environment specific. Windows vs Linux vs Mac, as an example. Nothing stopping someone from putting random.choice() for dependencies in a setup.py file. But that would be kinda evil. But could be done. (Listener homework?) The wheel format is way more predictable because it limits some of this freedom. wheels don’t get run when they install, they really just get unpacked. More info on wheels: Kind of a tangent, but why not: From: https://pythonwheels.com “Advantages of wheels Faster installation for pure Python and native C extension packages. Avoids arbitrary code execution for installation. (Avoids setup.py) Installation of a C extension does not require a compiler on Linux, Windows or macOS.
Allows better caching for testing and continuous integration. Creates .pyc files as part of installation to ensure they match the Python interpreter used. More consistent installs across platforms and machines.” Michael #6: PyGame series via Matthew Ward Learn how to program in Python by building a simple dice game Build a game framework with Python using the PyGame module How to add a player to your Python game Using PyGame to move your game character around What's a hero without a villain? How to add one to your Python game Put platforms in a Python game with PyGame Also: Shout out to Mission Python book: Code a Space Adventure Game!
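As a follow-up to the contextlib item in this episode, here is a small self-contained sketch that wraps `redirect_stdout` in a reusable helper. `noisy` and `capture` are hypothetical names introduced for illustration.

```python
import io
from contextlib import redirect_stdout


def noisy():
    # Hypothetical function that prints as a side effect.
    print("hello from noisy()")
    return 42


def capture(func, *args, **kwargs):
    """Call func and return (its result, everything it printed to stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, printed = capture(noisy)
    print(value)
    print(repr(printed))
```

`contextlib.redirect_stderr` works the same way for the error stream. Both only affect writes that go through `sys.stdout` or `sys.stderr`, so output written directly to the underlying file descriptor (for example by a subprocess) is not captured.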
February 2, 2019
Sponsored by pythonbytes.fm/datadog Special guest: Nina Zakharenko Brian #1: Great Expectations A set of tools intended for batch time testing of data pipeline data. Introduction to the problem doc: Down with Pipeline debt / Introducing Great Expectations expect_[something]() methods that return json formatted descriptions of whether or not the passed in data matches your expectations. Can be used programmatically or interactively in a notebook. (video demo). For programmatic use, I’m assuming you have to put code in place to stop a pipeline stage if expectations aren’t met, and write failing json result to a log or something. Examples, just a few, full list is big: Table shape: expect_column_to_exist, expect_table_row_count_to_equal Missing values, unique values, and types: - expect_column_values_to_be_unique, expect_column_values_to_not_be_null Sets and ranges expect_column_values_to_be_in_set String matching expect_column_values_to_match_regex Datetime and JSON parsing Aggregate functions expect_column_stdev_to_be_between Column pairs Distributional functions expect_column_chisquare_test_p_value_to_be_greater_than Nina #2: Using CircuitPython and MicroPython to write Python for wearable electronics and embedded platforms I’ve been playing with electronics projects as a hobby for the past two years, and a few months ago turned my attention to Python on microcontrollers MicroPython is a lean and efficient implementation of Python3 that can run on microcontrollers with just 256k of code space, and 16k of RAM. CircuitPython is a port of MicroPython, optimized for Adafruit devices. Some of the devices that run Python are as small as a quarter. My favorite Python hardware platform for beginners is Adafruit’s Circuit PlayGround Express. It has everything you need to get started with programming hardware without soldering. All you’ll need is alligator clips for the conductive pads. 
The board features NeoPixel LEDs, buttons, switches, temperature, motion, and sound sensors, a tiny speaker, and lots more. You can even use it to control servos, tiny motor arms. Best of all, it only costs $25. If you want to program the Circuit PlayGround Express with a drag-n-drop style scratch-like interface, you can use Microsoft’s MakeCode. It’s perfect for kids and you’ll find lots of examples on their site. Best of all, there are tons of guides for Python projects to build on their website, from making your own synthesizers, to jewelry, to silly little robots. Check out the repo for my Python-powered earrings, see a photo, or a demo. Sign up for the Adafruit Python for Microcontrollers mailing list here, or see the archives here. Michael #3: Data class CSV reader Map CSV to Data Classes You probably know about reading CSV files Maybe as tuples Better with csv.DictReader This library is similar but maps Python 3.7’s data classes to rows of CSV files Includes type conversions (say string to int) Automatic type conversion. DataclassReader supports str, int, float, complex and datetime DataclassReader use the type annotation to perform validation of the data of the CSV file. Helps you troubleshoot issues with the data in the CSV file. DataclassReader will show exactly in which line of the CSV file contain errors. Extract only the data you need. It will only parse the properties defined in the dataclass It uses dataclass features that let you define metadata properties so the data can be parsed exactly the way you want. Make the code cleaner. No more extra loops to convert data to the correct type, perform validation, set default values, the DataclassReader will do all this for you Default fallback values, more. Brian #4: How to Rock Python Packaging with Poetry and Briefcase Starts with a discussion of the packaging (for those readers that don’t listen to Python Bytes, I guess.) 
However, it also puts flit, pipenv, and poetry in context with each other, which is nice. Runs through a tutorial of how to build a pyproject.toml based project using poetry and briefcase. We’ve talked about Poetry before, on episode 100. pyproject.toml is discussed extensively on Test & Code 52. briefcase is new, though, it’s a project for creating standalone native applications for Mac, Windows, Linux, iOS, Android, and more. The tutorial also discusses using poetry directly to publish to the test-pypi server. This is a nice touch. Use the test-pypi before pushing to the real pypi. Very cool. Nina #5: awesome-python-security *🕶🐍🔐, a collection of tools, techniques, and resources to make your Python more secure* All of your production and client-facing code should be written with security in mind This list features a few resources I’ve heard of such as Anthony Shaw’s excellent 10 common security gotchas article which highlights problems like input injection and depending on assert statements in production, and a few that are new to me: OWASP (Open Web Application Security Project) Python Resources at pythonsecurity.org bandit a tool to find common security issues in Python bandit features a lot of useful plugins, that test for issues like: hardcoded password strings leaving flask debug on in production using exec() in your code & more detect-secrets, a tool to detect secrets left accidentally in a Python codebase & lots more like resources for learning about security concepts like cryptography See the full list for more Michael #6: pydbg Python implementation of the Rust dbg macro Best seen with an example. 
Rather than printing things you want to inspect, you: a = 2 b = 3 dbg(a+b) def square(x: int) -> int: return x * x dbg(square(a)) outputs: [testfile.py:4] a+b = 5 [testfile.py:9] square(a) = 4 Extras: Brian: pathlib + pytest tmpdir → tmp_path & tmp_path_factory https://docs.pytest.org/en/latest/tmpdir.html These two new fixtures (as of pytest 3.9) act like the good old tmpdir and tmpdir_factory, but return pathlib Path objects. Awesome. Michael: The Art of Python is a miniature arts festival at PyCon North America 2019, focusing on narrative, performance, and visual art. We intend to encourage and showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python). We hope that by attending, our audience will discover new aspects of empathy and rapport, and find a different kind of delight and perspective than might otherwise be expected at a large conference. StackOverflow Survey is Open! https://stackoverflow.az1.qualtrics.com/jfe/form/SV_1RGiufc1FCJcL6B NumPy Is Awaiting Fix for Critical Remote Code Execution Bug via Doug Sheehan The issue was raised on January 16 and affects NumPy versions 1.10 (released in 2015) through 1.16, which is the latest release at the moment, released on January 14 The problem is with the 'pickle' module, which is used for transforming Python object structures into a format that can be stored on disk or in databases, or that allows delivery across a network. The issue was reported by security researcher Sherwel Nan, who says that if a Python application loads malicious data via the numpy.load function an attacker can obtain remote code execution on the machine. Get your google data All google docs in MS Office format via https://takeout.google.com/settings/takeout All Gmail in MBOX format from there as well Hint: Start with nothing selected ;) Nina: I’m teaching a two day Intro and Intermediate Python course on March 19th and 20th. 
The class will live-stream for free here on each day, or you can join in person from downtown Minneapolis. All of the course materials will be released for free as well. I recently recorded a series of videos with Carlton Gibson (Django maintainer) on developing Django Web Apps with VS Code, deploying them to Azure with a few clicks, setting up a Continuous Integration / Continuous Delivery pipeline, and creating serverless apps. Watch the series here: https://aka.ms/python-videos I’ll be a mentor at a brand new hatchery event at PyCon US 2019, mentored sprints for diverse beginners organized by Tania Allard. The goal is to help underrepresented folks at PyCon contribute to open source in a supportive environment. The details will be located here (currently a placeholder) when they’re finalized. Catch my talk about electronics projects in Python with LEDs at PyCascades in Seattle on February 24th. Tickets are currently still on sale. If you haven’t tried the Python extension for VS Code, now is a great time. The December release included some killer features, such as remote Jupyter support, and exporting Python files as Jupyter notebooks. Keep up with future releases at the Python at Microsoft blog.
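To make Michael's item #3 in this episode (mapping CSV rows to data classes) a bit more concrete, here is a minimal standard-library sketch of the idea. It is not the DataclassReader API from the library; the User class, the read_dataclasses helper, and the sample data are all made up for illustration. It converts each column using the dataclass type annotation, ignores columns the dataclass does not define, and reports the CSV line number when a conversion fails.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class User:
    # Hypothetical record type, invented for this example.
    name: str
    age: int
    score: float


def read_dataclasses(stream, cls):
    """Yield one `cls` instance per CSV row.

    Each column is converted by calling the type from the dataclass
    annotation (str, int, float, ...). Columns that the dataclass does
    not define are ignored.
    """
    reader = csv.DictReader(stream)
    for row in reader:
        kwargs = {}
        for field in fields(cls):
            raw = row[field.name]
            try:
                kwargs[field.name] = field.type(raw)
            except ValueError as exc:
                # reader.line_num counts lines read so far, header included.
                raise ValueError(
                    f"line {reader.line_num}: cannot convert {raw!r} to "
                    f"{field.type.__name__} for column {field.name!r}"
                ) from exc
        yield cls(**kwargs)


# Sample data with an extra "comment" column that User does not define.
SAMPLE = "name,age,score,comment\nAda,36,9.5,ignored\nGrace,45,8.0,ignored\n"
users = list(read_dataclasses(io.StringIO(SAMPLE), User))
print(users)
```

This only handles types that can be built from a single string, so something like datetime would need a format hint, which is presumably where the library's metadata support comes in.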
January 26, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: What should be in the Python standard library? on lwn.net by Jake Edge There was a discussion recently about what should be in the standard library, triggered by a request to add LZ4 compression. Kinda hard to summarize but we’ll try: Jonathan Underwood proposed adding LZ4 compression to stdlib. Can of worms opened. zlib and bz2 are already in stdlib. Brett proposed making something similar to hashlib for compression algorithms. Against adding it: lz4 not needed for stdlib, and actually, bz2 isn’t either, but it’s kinda late to remove. PyPI is easy enough. Put stuff there. Led to a discussion of the role of stdlib. If it’s batteries included, shouldn’t we add new batteries? Some people don’t have access to PyPI easily. Do we never remove elements? Really? Maybe we should have a lean stdlib and a thicker standard distribution of selected packages. Who would decide? The same problem exists then of depending on it. How to remove stuff? Steve Dower would rather see a smaller standard library with some kind of "standard distribution" of PyPI modules that is curated by the core developers. A leaner stdlib could speed up Python version schedules and reduce burden on core devs to maintain seldom used packages. See? Can of worms. In any case, all this would require a PEP, so we have to wait until we have a PEP process decided on. Michael #2: Data Science portal for Home Assistant launched via Paul Cutler Home Assistant is launching a data science portal to teach you how you can learn from your own smart home data. In 15 minutes you set up a local data science environment running reports. A core principle of Home Assistant is that a user has complete ownership of their personal data. A user's data lives locally, typically on the SD card in their Raspberry Pi. The Home Assistant Data Science website is your one-stop-shop for advice on getting started doing data science with your Home Assistant data.
To accompany the website, we have created a brand new Hass.io Add-on JupyterLab lite, which allows you to run a data science IDE called JupyterLab directly on your Raspberry Pi hosting Home Assistant. You do your data analysis locally; your data never leaves your local machine. When you build something cool, you can share the notebook without the results, so people can run it at their homes too. We have also created a Python library called the HASS-Data-Detective which makes it super easy to get started investigating your Home Assistant data using modern data science tools such as Pandas. Check out the Getting Started notebook IoT aside: I finally found my first IoT project: Recording in progress button. Brian #3: What's the future of the pandas library? Kevin Markham over at dataschool.io pandas is gearing up to move towards a 1.0 release. Currently rc-ing 0.24. Plans are to get there “early 2019”. Some highlights: method chaining - encouraged by the core team; to encourage it further, more methods will support chaining. Apache Arrow likely to be part of pandas backend sometime after 1.0. Extension arrays - allow you to create custom data types. Deprecations: inplace parameter. It doesn’t work with chaining, doesn’t actually prevent copies, and causes codebase complexity. ix accessor, use loc and iloc instead. Panel data structure. Use MultiIndex instead. SparseDataFrame. Just use a normal DataFrame. Legacy Python support. Michael #4: PyOxidizer PyOxidizer is a collection of Rust crates that facilitate building libraries and binaries containing Python interpreters. PyOxidizer is capable of producing a single file executable - with all dependencies statically linked and all resources (like .pyc files) embedded in the executable. The Oxidizer part of the name comes from Rust: executables produced by PyOxidizer are compiled from Rust and Rust code is responsible for managing the embedded Python interpreter and all its operations.
PyOxidizer is similar in nature to PyInstaller, Shiv, and other tools in this space. What generally sets PyOxidizer apart is Produced executables contain an embedded, statically-linked Python interpreter have no additional run-time dependency on the target system runs everything from memory (as opposed to e.g. extracting Python modules to a temporary directory and loading them from there). Brian #5: Working With Files in Python by Vuyisile Ndlovu on RealPython Very comprehensive write up on working with files and directories Includes legacy and modern methods. Pay attention to pathlib parts if you are using 3.4 plus Also great for “if you used to do x, here’s how to do it with pathlib”. Included: Directory listings getting file attributes creating directories file name pattern matching traversing directories doing stuff with the files in there creating temp directories and files deleting, copying, moving, renaming archiving with zip and tar including reading those looping over files Michael #6: $ python == $ python3? via David Furphy Homebrew tried this recently & got "persuaded" to reverse. Also in recent discussion of edits to PEP394, GvR said absolutely not now, probably not ever. Guido van Rossum RE: python doesn’t exist on macOS as a command: Did you mean python2 there? In my experience macOS comes with python installed (and invoking Python 2) but no python2 link (hard or soft). In any case I'm not sure how this strengthens your argument. I'm also still unhappy with any kind of endorsement of python pointing to python3. When a user gets bitten by this they should receive an apology from whoever changed that link, not a haughty "the PEP endorses this". Regardless of what macOS does I think I would be happier in a future where python doesn't exist and one always has to specify python2 or python3. Quite possibly there will be an age where Python 2, 3 and 4 all overlap, and EIBTI. 
Extras: Michael: A letter to the Python community in Africa via Anthony Shaw Believe the broader international Python and Software community can learn a lot from what so many amazing people are doing across Africa. e.g. The attendance of PyCon NA was 50% male and 50% female. Joke: via Luke Russell: A: “Knock Knock” B: “Who’s There" A: ……………………………………………………………………………………….“Java” Also: Java 4EVER video is amazing: youtube.com/watch?v=kLO1djacsfg
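As a companion to Brian's item #5 in this episode (Working With Files in Python), here is a small sketch that walks through a few of the listed operations with pathlib, tempfile, and zipfile: creating directories, listing, pattern matching, reading file attributes, and archiving. It is not taken from the RealPython article, and the file and directory names are invented for the example. Everything happens inside a temporary directory that is removed at the end.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Creating directories (including missing parents) and a few files.
    data_dir = root / "data" / "2019"
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / "a.csv").write_text("x,y\n1,2\n")
    (data_dir / "b.csv").write_text("x,y\n3,4\n")
    (root / "notes.txt").write_text("hello")

    # Directory listing: only the direct children of root.
    top_level = sorted(p.name for p in root.iterdir())

    # File name pattern matching while traversing subdirectories.
    csv_files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.csv")
    )

    # File attributes: size in bytes from stat().
    size = (root / "notes.txt").stat().st_size

    # Archiving with zip, then reading the archive listing back.
    archive = root / "data.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        for p in root.rglob("*.csv"):
            zf.write(p, arcname=p.relative_to(root).as_posix())
    with zipfile.ZipFile(archive) as zf:
        archived = sorted(zf.namelist())

print(top_level, csv_files, size, archived)
```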
January 18, 2019
Sponsored by https://pythonbytes.fm/digitalocean Brian #1: Advent of Code 2018 Solutions Michael Fogleman Even if you didn’t have time or energy to do the 2018 AoC, you can learn from other people's solutions. Here’s one set written up in a nice blog post. Michael #2: Python Lands on the Windows 10 App Store Python Software Foundation recently released Python 3.7 as an app on the official Windows 10 app store. Python 3.7 is now available to install from the Microsoft Store, meaning you no longer need to manually download and install the app from the official Python website. There is one limitation: “Because of restrictions on Microsoft Store apps, Python scripts may not have full write access to shared locations such as TEMP and the registry.” Discussed with Steve Dower over on Talk Python 191 Brian #3: How I Built A Python Web Framework And Became An Open Source Maintainer Florimond Manca Bocadillo - “A modern Python web framework filled with asynchronous salsa” “maintaining an open source project is a marathon, not a sprint.” The article ends with tips, recommendations, and tool choices on the following topics: Project definition Marketing & Communication Community Project management Code quality Documentation Versioning and releasing Michael #4: Python maintainability score via Wily via Anthony Shaw A Python application for tracking, reporting on timing and complexity in tests Easiest way to calculate it is with wily https://github.com/tonybaloney/wily … the metrics are ‘maintainability.mi’ and ‘maintainability.rank’ for a numeric and the A-F scale.
Build an index: wily build src Inspect report: wily report file Graph: wily graph file metric Brian #5: A couple fun awesome lists Awesome Python Security resources Tools web framework hardening, ex: secure.py multi tools static code analysis, ex: bandit vulnerabilities and security advisories cryptography app templates Education lots of resources for learning Companies Awesome Flake8 Extensions clean code testing, including flake8-pytest - Enforces to use pytest-style assertions flake8-mock - Provides checking mock non-existent methods security documentation enhancements copyrights Michael #6: fastlogging via Robert Young A faster replacement of the standard logging module with a mostly compatible API. For a single log file it is ~5x faster and for rotating log file ~13x faster. It comes with the following features: (colored, if colorama is installed) logging to console logging to file (maximum file size with rotating/history feature can be configured) old log files can be compressed (the compression algorithm can be configured) count same successive messages within a 30s time frame and log only once the message with the counted value. log domains log to different files writing to log files is done in (per file) background threads, if configured configure callback function for custom detection of same successive log messages configure callback function for custom message formatter configure callback function for custom log writer >> import antigravity
January 11, 2019
Sponsored by https://pythonbytes.fm/datadog Brian #1: nbgrader nbgrader: A Tool for Creating and Grading Assignments in the Jupyter Notebook The Journal of Open Source Education, paper accepted 6-Jan-2019 nbgrader documentation, including an intro video From the JOSE article: “nbgrader is a flexible tool for creating and grading assignments in the Jupyter Notebook (Kluyver et al., 2016). nbgrader allows instructors to create a single, master copy of an assignment, including tests and canonical solutions. From the master copy, a student version is generated without the solutions, thus obviating the need to maintain two separate versions. nbgrader also automatically grades submitted assignments by executing the notebooks and storing the results of the tests in a database. After auto-grading, instructors can manually grade free responses and provide partial credit using the formgrader Jupyter Notebook extension. Finally, instructors can use nbgrader to leave personalized feedback for each student’s submission, including comments as well as detailed error information.” CS teaching methods have come a long way since I was turning in floppies and code printouts. Michael #2: profanity-check A fast, robust Python library to check for offensive language in strings. profanity-check uses a linear SVM model trained on 200k human-labeled samples of clean and profane text strings. This makes profanity-check both robust and extremely performant. Other libraries like profanity-filter use more sophisticated methods that are much more accurate but at the cost of performance. profanity-filter runs in 13,000ms vs 24ms for profanity-check in a benchmark Two ways to use: predict(text) → 0 or 1 (1 = bad) predict_prob(text) → probability in [0, 1] (1 = bad) Brian #3: An Introduction to Python Packages for Absolute Beginners Ever tried to explain the difference between module and package?
Between package-in-the-directory-with-init sense and package-you-can-distribute-and-install-with-pip sense? Here’s the article to read beforehand. Modules, packages, using packages, installing, importing, and more. And that’s not even getting into flit and poetry, etc. But it’s a good place to start for people new to Python. Michael #4: Python Dependencies and IoC via Joscha Götzer Open-closed principle is at work with these and is super valuable to testing (one of the SOLID principles): Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification. There is a huge debate around why Python doesn’t need DI or Inversion of Control (IoC), and a quick stackoverflow search yields multiple results along the lines of “python is a scripting language and dynamic enough so that DI/IoC makes no sense”. However, especially in large projects, it might reduce the cognitive load and improve the decoupling of individual components. Dependency Injector: I couldn’t get this one to work on Windows, as it needs to compile some C libraries and some Visual Studio tooling was missing that I couldn’t really install properly. The library looks quite promising though, but sort of static with heavy usage of containers and not necessarily pythonic. Injector: The library that the above-mentioned article talks about, a little Java-esque. pinject: Has been unmaintained for about 5 years, and only recently got new attention from some open source people who try to port it to python3. A product under Google copyright, and looks quite nice despite the lack of python3 bindings. Probably the most feature-rich of the listed libraries. python-inject: I discovered that one while writing this email, not really sure if it’s any good. Nice use of type annotations and testing features. di-py: Only works up to python 3.4, so I’ve also never tried it (I’m one of those legacy python haters, I’m sure you can relate 😄). Serum: This one is a little too explicit to my mind.
It makes heavy use of context managers (literally with Context(...): everywhere 😉) and I’m not immediately sure how to work with it. In this way, it is quite powerful though. Interesting use of class decorators. And now on to my favorite and a repeated recommendation of mine around the internet→ Haps: This lesser-known, lightweight library is sort of the new kid on the block, and really simple to use. As some of the other libraries, it uses type annotations to determine the kind of object it is supposed to instantiate, and automatically discovers the required files in your project folder. Haps is very pythonic and fits into apps of any size, helping to ensure modularization as the only dependency of your modules will be one of the types provided by the library. Pretty good example here. Brian #5: A Gentle Introduction to Pandas Really a gentle introduction to the Pandas data structures Series and DataFrame. Very gentle, with console examples. Create series objects: from an array from an array, and change the indexing from a dictionaries from a scalar, cool. didn’t know you could do that Accessing elements in a series DataFrames sorting, slicing selecting by label, position statistics on columns importing and exporting data Michael #6: Don't use the greater than sign in programming One simple thing that comes up time and time again is the use of the greater than sign as part of a conditional while programming. Removing it cleans up code. Let's say that I want to check that something is between 5 and 10. There are many ways I can do this x > 5 and 10 > x 5 < x and 10 > x x > 5 and x < 10 10 < x and x < 5 x < 10 and x > 5 x < 10 and 5 < x Sorry, one of those is incorrect. 
Go ahead and find out which one If you remove the use of the greater than sign then only 2 options remain x < 10 and 5 < x 5 < x and x < 10 The last is nice because x is literally between 5 and 10 There is also a nice way of expressing that "x is outside the limits of 5 and 10” x < 5 or 10 < x Again, this expresses it nicely because x is literally outside of 5 to 10. Interesting comment: What is cleaner or easier to read comes down to personal taste. But how to express "all numbers greater than 1" without '>'? ans: 1 < allNumbers Extras Michael Teaching Python podcast by Kelly Paredes & Sean Tibor Github private repos (now free) EuroPython 2019 announced South African AWS Data Center coming (via William H.) Pandas is dropping legacy Python support any day now Joke: Harry Potter Parser Tongue via Nick Spirit
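For anyone who wants to check the puzzle in Michael's item #6 of this episode without squinting, here is a small brute-force sketch. It evaluates each of the six candidate expressions over a range of integers and compares them with Python's chained comparison 5 < x < 10, which reads the way the article recommends. The CANDIDATES dictionary and helper names are made up for this example.

```python
# Each candidate from the show notes, keyed by its source text.
CANDIDATES = {
    "x > 5 and 10 > x": lambda x: x > 5 and 10 > x,
    "5 < x and 10 > x": lambda x: 5 < x and 10 > x,
    "x > 5 and x < 10": lambda x: x > 5 and x < 10,
    "10 < x and x < 5": lambda x: 10 < x and x < 5,
    "x < 10 and x > 5": lambda x: x < 10 and x > 5,
    "x < 10 and 5 < x": lambda x: x < 10 and 5 < x,
}


def between(x):
    # Python's chained comparison: x sits literally between 5 and 10.
    return 5 < x < 10


def outside(x):
    # The "outside the limits" form from the article, less-than only.
    return x < 5 or 10 < x


# Collect every candidate that disagrees with `between` for some x.
wrong = [
    label
    for label, check in CANDIDATES.items()
    if any(check(x) != between(x) for x in range(0, 16))
]
print(wrong)
```

Note that outside(x) is not simply the negation of between(x): the boundary values 5 and 10 are in neither set, because both checks use strict comparisons.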
January 5, 2019
Sponsored by https://pythonbytes.fm/datadog Brian #1: loguru: Python logging made (stupidly) simple Finally, a logging interface that is just slightly more syntax than print to do mostly the right thing, and all that fancy stuff like log rotation is easy to figure out. i.e. a logging API that fits in my brain. bonus: README is a nice tour of features with examples. Features: Ready to use out of the box without boilerplate No Handler, no Formatter, no Filter: one function to rule them all Easier file logging with rotation / retention / compression Modern string formatting using braces style Exceptions catching within threads or main Pretty logging with colors Asynchronous, Thread-safe, Multiprocess-safe Fully descriptive exceptions Structured logging as needed Lazy evaluation of expensive functions Customizable levels Better datetime handling Suitable for scripts and libraries Entirely compatible with standard logging Personalizable defaults through environment variables Convenient parser Exhaustive notifier Michael #2: Python gets a new governance model by Brett Cannon July 2018: Guido steps down. Python progress has basically been on hold since then. Ended up with 7 governance proposals. Voting was open to all core developers as we couldn't come up with reasonable criteria that we all agreed to as to what defined an "active" core dev And the winner is ... In the end PEP 8016, the steering council proposal, won. It was a decisive win against second place. PEP 8016 is heavily modeled on the Django project's organization (to the point that the PEP had stuff copy-and-pasted from the original Django governance proposal). What it establishes is a steering council of five people who are to determine how to run the Python project. Short of not being able to influence how the council itself is elected (which includes how the electorate is selected), the council has absolute power.
While the result of the vote prevents us from ever having the Python project be leaderless again, it doesn't directly solve how to guide the language's design. What's next? The next step is we elect the council. It's looking like nominations will be from Monday, January 07 to Sunday, January 20 and voting from Monday, January 21 to Sunday, February 03 A key point I hope people understand is that while we solved the issue of project management that stemmed from Guido's retirement, the council will need to be given some time to solve the other issue of how to manage the design of Python itself. Brian #3: Why you should be using pathlib Tour of pathlib from Trey Hunner pathlib combines most of the commonly used file and directory operations from os, os.path, and glob. It uses objects instead of strings. As of Python 3.6, many parts of stdlib support pathlib. Since pathlib.Path methods return Path objects, chaining is possible. Convert back to strings if you really need to for pre-3.6 code. Examples: make a directory: Path('src/__pypackages__').mkdir(parents=True, exist_ok=True) rename a file: Path('.editorconfig').rename('src/.editorconfig') find some files: top_level_csv_files = Path.cwd().glob('*.csv') recursively: all_csv_files = Path.cwd().rglob('*.csv') read a file: Path('some/file').read_text() write to a file: Path('.editorconfig').write_text('# config goes here') with open(path, mode) as x works with Path objects as of 3.6 Follow up article by Trey: No really, pathlib is great Michael #4: Altair and Altair Recipes via Antonio Piccolboni (he wrote altair_recipes) Altair: Declarative statistical visualization library for Python Altair is developed by Jake Vanderplas and Brian Granger By statistical visualization they mean: The data source is a DataFrame that consists of columns of different data types (quantitative, ordinal, nominal and date/time). The DataFrame is in a tidy format where the rows correspond to samples and the columns correspond to the observed variables.
The data is mapped to the visual properties (position, color, size, shape, faceting, etc.) using the group-by data transformation. Nice example that I can get behind # cars = some Pandas data frame alt.Chart(cars).mark_point().encode( x='Horsepower', y='Miles_per_Gallon', color='Origin', ) altair_recipes Altair allows generating a wide variety of statistical graphics in a concise language, but lacks, by design, pre-cooked and ready to eat statistical graphics, like the boxplot or the histogram. Examples: https://altair-recipes.readthedocs.io/en/latest/examples.html They take a few lines only in altair, but I think they deserve to be one-liners. altair_recipes provides that level on top of altair. The idea is not to provide a multitude of creative plots with fantasy names (the way seaborn does) but a solid collection of classics that everyone understands and cover most major use cases: the scatter plot, the boxplot, the histogram etc. Fully documented, highly consistent API (see next package), 90%+ test coverage, maintainability grade A, this is professional stuff if I may say so myself. Brian #5: A couple fun pytest plugins pytest-picked Using git status, this plugin allows you to: Run only tests from modified test files Run tests from modified test files first, followed by all unmodified tests Kinda hard to overstate the usefulness of this plugin to anyone developing or debugging a test. Very, very cool. pytest-clarity Colorized left/right comparisons Early in development, but already helpful. I recommend running it with -qq if you don’t normally run with -v/--verbose since it overrides the verbosity currently. Michael #6: Secure 🔒 headers and cookies for Python web frameworks Python package called Secure, which sets security headers and cookies (as a start) for Python web frameworks. I was listening to the Talk Python To Me episode “Flask goes 1.0” with Flask maintainer David Lord. 
At the end of the interview he was asked about notable PyPI packages and spoke about Flask-Talisman, a third-party package to set security headers in Flask. As a security professional, it was surprising and encouraging to hear the maintainer of the most popular Python web framework speak passionately about a security package. Had been recently experimenting with emerging Python web frameworks and realized there was a gap in security packages. That inspired Caleb to (humbly) see if it were possible to make a package to correct that and I started with Responder and then expanded to support more frameworks. The outcome was Secure with functions to support aiohttp, Bottle, CherryPy, Falcon, hug, Pyramid, Quart, Responder, Sanic, Starlette and Tornado (most of these, if not all, have been featured on Talk Python) and can also be utilized by frameworks not officially supported. The goal is to be minimalistic, lightweight and be implemented in a way that does not disrupt an individual framework’s design. I have had some great feedback and suggestions from the developer and OWASP community, including some awesome discussions with the OWASP Secure Project and the Sanic core team. Added support for Flask and Django too. Secure Cookies is nice in the mix Extras: Michael: SQLite bug impacts thousands of apps, including all Chromium-based browsers See https://twitter.com/mborus/status/1080874700924964864 Since this bug is triggered by an SQL command, general CPython usage should not be affected, as long as you don’t run arbitrary SQL commands provided by the outside. Seems to NOT be a problem in CPython: https://twitter.com/mborus/status/1080883549308362753 Michael: Follow up to our AI and healthcare conversation via Bradley Hintze I found your discussion of deep learning in healthcare interesting, no doubt because that is my area. I am the data scientist for the National Oncology Program at the Veterans Health Administration.
I work directly with clinicians and it is my strong opinion that AI cannot take the job from the MD. It will however make caring for patients much more efficient as AI takes care of the low hanging fruit, if you will. Healthcare, believe it or not, is a science and an art. This is why AI is never going to make doctors obsolete. It will, however, make doctors more efficient and demand a more sophisticated doctor -- one that understands AI enough to not only trust it but, crucially, comprehend its limits. Michael: Upgrade to Python 3.7.2 If you install via Homebrew, it’s time for brew update && brew upgrade Michael: New course! Introduction to Ansible
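Brian's item #3 in this episode says pathlib folds together operations from os.path and lets you chain them on objects instead of strings. Here is a short side-by-side sketch of that claim. It uses posixpath and PurePosixPath so the results are the same on every platform and nothing touches the file system; the example path is made up.

```python
import posixpath
from pathlib import PurePosixPath

# String-based style: every step is a separate function call on a str.
old_path = posixpath.join("src", "pkg", "settings.toml")
old_name = posixpath.basename(old_path)
old_stem, old_ext = posixpath.splitext(old_name)
old_parent = posixpath.dirname(old_path)
old_backup = posixpath.join(old_parent, old_stem + ".bak")

# Object-based style: the same information as attributes and methods.
new_path = PurePosixPath("src") / "pkg" / "settings.toml"
new_backup = new_path.with_suffix(".bak")

print(old_backup)
print(new_backup)        # a path object, printed via str()
print(str(new_backup))   # convert back to str for string-only APIs
```

In real code you would normally use Path rather than PurePosixPath; the pure variant is used here only to keep the example deterministic across operating systems.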
December 26, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean This episode originally aired on Talk Python at talkpython.fm/192. It's been a fantastic year for Python. Literally, every year is better than the last with so much growth and excitement in the Python space. That's why I've asked two of my knowledgeable Python friends, Dan Bader and Brian Okken, to help pick the top 10 stories from the Python community for 2018. Guests Brian Okken @brianokken Dan Bader @dbader_org 10: Python 3.7: Cool New Features in Python 3.7 9: Changes in versioning patterns ZeroVer: 0-based Versioning Calendar Versioning Semantic Versioning 2.0.0 8: Python is becoming the world’s most popular coding language Economist article 7: 2018 was the year data science Pythonistas == web dev Pythonistas Python Developers Survey Results Covered in depth on Talk Python 176 6: Black Project Soundgarden: “Black Hole Sun” 5: New PyPI launched! Python Package Index 4: Rise of Python in the embedded world Covered at Python Bytes 3: Legacy Python's days are fading? Python 2.7 -- bugfix or security before EOL? Python 2 death clock: https://pythonclock.org/ 2: It's the end of innocence for PyPI Twelve malicious Python libraries found and removed from PyPI 1: Guido stepped down as BDFL python-committers: Transfer of power Proposals for new governance structure
December 18, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python Descriptors Are Magical Creatures an excellent discussion of understanding @property and Python’s descriptor protocol. discussion includes getter, setter, and deleter methods you can override. Michael #2: Data Science Survey 2018 JetBrains JetBrains polled over 1,600 people involved in Data Science and based in the US, Europe, Japan, and China, in order to gain insight into how this industry sector is evolving Key Takeaways Most people assume that Python will remain the primary programming language in the field for the next 5 years. Python is currently the most popular language among data scientists. Data Science professionals tend to use Keras and Tableau, while amateur data scientists are more likely to prefer Microsoft Azure ML. Most common activities among pros and amateurs: Data processing Data visualization Main programming language for data analysis Python 57% R 15% Julia 0% IDEs and Editors Jupyter 43% PyCharm 38% RStudio 23% … Brian #3: cache.py cache.py is a one file python library that extends memoization across runs using a cache file. memoization is an incredibly useful technique that many self taught or on the job taught developers don’t know about, because it’s not obvious. example: import cache @cache.cache() def expensive_func(arg, kwarg=None): # Expensive stuff here return arg The @cache.cache() function can take multiple arguments. @cache.cache(timeout=20) - Only caches the function for 20 seconds. @cache.cache(fname="my_cache.pkl") - Saves cache to a custom filename (defaults to hidden file .cache.pkl) @cache.cache(key=cache.ARGS[KWARGS,NONE]) - Check against args, kwargs or neither of them when doing a cache lookup. Michael #4: Setting up the data science tools part of a larger video series set up. 
Builds up to Keras ultimately Tools: Anaconda, TensorFlow, Jupyter, Keras Good for true beginners Set up and activate a conda venv Start up a notebook and switch envs Use conda, rather than pip Brian #5: chartify “Python library that makes it easy for data scientists to create charts.” from the docs: Consistent input data format: Spend less time transforming data to get your charts to work. All plotting functions use a consistent tidy input data format. Smart default styles: Create pretty charts with very little customization required. Simple API: We've attempted to make the API as intuitive and easy to learn as possible. Flexibility: Chartify is built on top of Bokeh, so if you do need more control you can always fall back on Bokeh's API. Michael #6: CPython byte code explorer JupyterLab extension to inspect Python Bytecode via Anton Helm by Jeremy Tuloup You’ll see exactly what it’s about if you watch the GIF movie at the GitHub repo. Can’t think of a better way to understand Python bytecode quickly than to play a little with this Comparing versions of CPython: If you have several versions of Python installed on your machine (let's say in different conda environments), you can use the extension to check how the bytecode might differ. Nice visualization of different performance aspects of while vs. for at the end
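A rough way to try the same while vs. for comparison without the JupyterLab extension is the standard library's dis module. This is only a sketch, not the extension's own code: the functions sum_with_while, sum_with_for, and opcode_names are hypothetical names made up for illustration, and the exact opcodes printed differ between CPython versions.

```python
import dis


def sum_with_while(n):
    """Add 0..n-1 using an explicit counter (hypothetical example)."""
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total


def sum_with_for(n):
    """Add 0..n-1 by iterating over range (hypothetical example)."""
    total = 0
    for i in range(n):
        total += i
    return total


def opcode_names(func):
    """Return the opcode names CPython compiled for func, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    for func in (sum_with_while, sum_with_for):
        names = opcode_names(func)
        print(f"{func.__name__}: {len(names)} instructions")
        dis.dis(func)  # full listing; output differs across CPython versions
```

Running the same file under two interpreters (say, two conda environments) and diffing the output gives a plain-text version of the "comparing versions of CPython" idea.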
December 11, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: pyjanitor - for cleaning data originally a port of an R package called janitor, now much more. “pyjanitor’s etymology has a two-fold relationship to “cleanliness”. Firstly, it’s about extending Pandas with convenient data cleaning routines. Secondly, it’s about providing a cleaner, method-chaining, verb-based API for common pandas routines.” functionality: Cleaning columns name (multi-indexes are possible!) Removing empty rows and columns Identifying duplicate entries Encoding columns as categorical Splitting your data into features and targets (for machine learning) Adding, removing, and renaming columns Coalesce multiple columns into a single column Convert Excel date (serial format) into a Python datetime format Expand a single column that has delimited, categorical values into dummy-encoded variables This pandas code: df = pd.DataFrame(...) # create a pandas DataFrame somehow. del df['column1'] # delete a column from the dataframe. df = df.dropna(subset=['column2', 'column3']) # drop rows that have empty values in column 2 and 3. df = df.rename(columns={'column2': 'unicorns', 'column3': 'dragons'}) # rename column2 and column3 df['newcolumn'] = ['iterable', 'of', 'items'] # add a new column. looks like this with pyjanitor: df = ( pd.DataFrame(...) .remove_columns(['column1']) .dropna(subset=['column2', 'column3']) .rename_column('column2', 'unicorns') .rename_column('column3', 'dragons') .add_column('newcolumn', ['iterable', 'of', 'items']) ) Michael #2: What Does It Take To Be An Expert At Python? Presentation at PyData 2017 by James Powell Covers Python Data Model (dunder methods) Covers uses of Metaclasses All done very smoothly as a series of demos Pretty long and in depth, 1.5+ hours Brian #3: Awesome Python Applications PyPI is a great place to find great packages you can use as examples for the packages you write. Where do you go for application examples? 
Well, now you can go to Awesome Python Applications. categories of applications included: internet, audio, video, graphics, games, productivity, organization, communication, education, science, CMS, ERP (enterprise resource planning), static site generators, and a whole slew of developer related applications. Mahmoud is happy to have help filling this out, so if you know of a great open source application written in Python, go ahead and contribute to this, or open an issue on this project. Michael #4: Django Core no more Write up by James Bennett If you’re not the sort of person who closely follows the internals of Django’s development, you might not know there’s a draft proposal to drastically change the project’s governance. What’s up: Django the open-source project is OK right now, but it has difficulty recruiting and retaining enough active contributors. Some of the biggest open-source projects dodge this by having, effectively, corporate sponsorship of contributions. Django has become sort of a victim of its own success: the types of easy bugfixes and small features that often are the path to growing new committers have mostly been done already in Django. It has not managed to bring in new committers at a sufficient rate to replace those who’ve become less active or even entirely inactive, and that’s not sustainable for much longer. Under-attracting women contributors too Governance: Some parallels to what the Python core devs are experiencing now. Project leads (BDFLs) stepped down. The proposal: what I’ve proposed is the dissolution of “Django core”, and the revocation of almost all commit bits Seems extreme but they were working much more as a team with PRs, etc anyway. Breaks down the barrier of needing to be on the core team to suggest or change anything. Two roles would be formalized, Mergers and Releasers, who would, respectively, merge pull requests into Django, and package/publish releases. 
But rather than being all-powerful decision-makers, these would be bureaucratic roles. Brian #5: wemake django template a cookie-cutter template for serious django projects with lots of fun goodies “This project is used to scaffold a django project structure. Just like django-admin.py startproject but better.” features: Always up-to-date with the help of [@dependabot](https://dependabot.com/) poetry for managing dependencies mypy for optional static typing pytest for unit testing flake8 and wemake-python-styleguide for linting pre-commit hooks for consistent development docker for development, testing, and production sphinx for documentation GitLab CI with full build, test, and deploy pipeline configured by default Caddy with https and http/2 turned on by default Michael #6: Django Hunter Tool designed to help identify incorrectly configured Django applications that are exposing sensitive information. Why? March 2018: 28,165 Django servers are exposed on the internet, many are showing secret API keys, database passwords, Amazon AWS keys. Example: https://twitter.com/6IX7ine/status/978598496658960384 Some complained this implied Django was insecure and said it wasn’t. Others thought “There is a reasonable argument to be made that DEBUG should default to False.” One beginner, Peter, chimes in: I probably have one of them, among my early projects that are on heroku and public GitHub repos. I did accidentally expose my aws password this way and all hell broke loose. The problem is that as a beginner, it wasn't obvious to me how to separate development and production settings and keep production stuff out of my public repository. Extras: Michael: Thanks for having me on your show Brian: https://blog.michaelckennedy.net/2018/12/08/being-a-great-podcast-guest/ Brian: open source extra: For Christmas, I want a dragon… pic.twitter.com/RmFAEgqpSr (Changelog, @changelog) Michael: Why did the multithreaded chicken cross the road? 
road the side get to the other of to to get the side to road the of other the side of to the to road other get to of the road to side other the get
December 7, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: glom: restructuring data, the Python way glom is a new approach to working with data in Python, featuring: Path-based access for nested structure data['a']['b']['c'] → glom(data, 'a.b.c') Declarative data transformation using lightweight, Pythonic specifications glom(target, spec, **kwargs) with options such as a default value if value not found allowed exceptions Readable, meaningful error messages: PathAccessError: could not access 'c', part 2 of Path('a', 'b', 'c') is better than TypeError: 'NoneType' object is not subscriptable Built-in data exploration and debugging features glom.Inspect(*a, **kw) The [Inspect](https://glom.readthedocs.io/en/latest/api.html#glom.Inspect) specifier type provides a way to get visibility into glom’s evaluation of a specification, enabling debugging of those tricky problems that may arise with unexpected data. Michael #2: Scientific GUI apps with TraitsUI via Franklin Ventura They support: PyQt, wxPython, PySide, PyQt5 People should be aware of it, and when combined with Chaco (again from Enthought) the graphing and controlling capabilities really are amazing. Tutorial: Writing a graphical application for scientific programming using TraitsUI 6.0 Really simple UI / API for mapping object(s) to GUIs and back. 
Brian #3: Pampy: The Pattern Matching for Python you always dreamed of “Pampy is pretty small (150 lines), reasonably fast, and often makes your code more readable and hence easier to reason about.” uses _ as the missing info in a pattern simple match signature of match(input, pattern, action) Examples nested lists and tuples from pampy import match, _ x = [1, [2, 3], 4] match(x, [1, [_, 3], _], lambda a, b: [1, [a, 3], b]) # => [1, [2, 3], 4] - dicts: pet = { 'type': 'dog', 'details': { 'age': 3 } } match(pet, { 'details': { 'age': _ } }, lambda age: age) # => 3 match(pet, { _ : { 'age': _ } }, lambda a, b: (a, b)) # => ('details', 3) Michael #4: Google AI better than doctors at detecting breast cancer Google’s deep learning AI called LYNA able to correctly identify tumorous regions in lymph nodes 99 per cent of the time. We think of the impact of AI as killing 'low end' jobs [see poster], but these are "doctor" level positions. The presence or absence of these ‘nodal metastases’ influence a patient’s prognosis and treatment plan, so accurate and fast detection is important. In a second trial, six pathologists completed a diagnostic test with and without LYNA’s assistance. With LYNA’s help, the doctors found it ‘easier’ to detect small metastases, and on average the task took half as long. Brian #5: 2018 Advent of Code Another winter break activity people might enjoy is practicing with code challenges. AoC is a fun tradition. a calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. don't need a computer science background to participate don’t need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware. There’s a leaderboard, so you can compete if you want. Or just have fun. Past years available, back to 2015. 
Some extra tools and info: awesome-advent-of-code Michael #6: Red Hat Linux 8.0 Beta released, now (finally) updated to use Python 3.6 as default instead of 2.7 First of all, my favorite comment was a correction to the title: legacy python. “Python 3.6 is the default Python implementation in RHEL 8; limited support for Python 2.7 is provided. No version of Python is installed by default.“ Red Hat Enterprise Linux 8 is distributed with Python 3.6. The package is not installed by default. To install Python 3.6, use the yum install python3 command. Python 2.7 is available in the python2 package. However, Python 2 will have a shorter life cycle and its aim is to facilitate smoother transition to Python 3 for customers. Neither the default python package nor the unversioned /usr/bin/python executable is distributed with RHEL 8. Customers are advised to use python3 or python2 directly. Alternatively, administrators can configure the unversioned python command using the alternatives command. Python scripts must specify major version in hashbangs at RPM build time In RHEL 8, executable Python scripts are expected to use hashbangs (shebangs) specifying explicitly at least the major Python version. Extras: Michael: We were featured on TechMeme Long Ride Home podcast. Check out their podcast here. Thank you to Brian McCullough, the host of the show. I just learned about their show through this exchange but can easily see myself listening from time to time. It’s like Python Bytes, but for the wider tech world and less developer focused but still solid tech foundations. Brian: First story was about glom. I had heard of glom before, but got excited after interviewing Mahmoud for T&C 55, where we discussed the difficulty in testing if you use glom or DSLs in general. A twitter exchange and GH issue followed the episode, with Anthony Shaw. At one point, Ant shared this great joke from Brenan Keller: A QA engineer walks into a bar. Orders a beer. Orders 0 beers. 
Orders 99999999999 beers. Orders a lizard. Orders -1 beers. Orders a ueicbksjdhd. First real customer walks in and asks where the bathroom is. The bar bursts into flames, killing everyone. — Brenan Keller (@brenankeller) November 30, 2018
December 1, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Dependency Management through a DevOps Lens Python Application Dependency Management in 2018 - Hynek An opinionated comparison of one use case and pipenv, poetry, pip-tools “We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.” Best disclaimer in a blog article ever: “DISCLAIMER: The following technical opinions are mine alone and if you use them as a weapon to attack people who try to improve the packaging situation you’re objectively a bad person. Please be nice.” Requirements: Solution needs to meet the following features: Allow me specify my immediate dependencies (e.g. Django), resolve the dependency tree and lock all of them with their versions and ideally hashes (more on hashes), integrate somehow with tox so I can run my tests, and finally allow me to install a project with all its locked dependencies into a virtual environment of my choosing. Seem like reasonable wishes. So far, none of the solutions work perfectly. A good example of pointing out tooling issues with his use case while being respectful of the people involved in creating other tools. 
Michael #2: Plugins made simple with pluginlib makes creating plugins for Python very simple it relies on metaclasses, but the average programmer can easily get lost dealing with metaclasses Main Features: Plugins are validated when they are loaded (instead of when they are used) Plugins can be loaded through different mechanisms (modules, filesystem paths, entry points) Multiple versions of the same plugin are supported (The newest one is used by default) Plugins can be blacklisted by type, name, or version Multiple plugin groups are supported so one program can use multiple sets of plugins that won't conflict Plugins support conditional loading (examples: os, version, installed software, etc) Once loaded, plugins can be accessed through dictionary or dot notation Brian #3: How to Test Your Django App with Selenium and pytest Bob Belderbos “In this article I will show you how to test a Django app with pytest and Selenium. We will test our CodeChalleng.es platform comparing the logged out homepage vs the logged in dashboard. We will navigate the DOM matching elements and more.” Michael #4: Fluent collection APIs (flupy and asq) flupy implements a fluent interface for chaining multiple method calls as a single python expression. All flupy methods return generators and are evaluated lazily in depth-first order. This allows flupy expressions to transform arbitrary size data in extremely limited memory. Example: pipeline = flu(count()).map(lambda x: x**2) \ .filter(lambda x: x % 517 == 0) \ .chunk(5) \ .take(3) for item in pipeline: print(item) The CLI in particular has been great for our data science team. Not everyone is super comfortable with linux-fu so having a cross-platform way to leverage python knowledge on the shell has been an easy win. 
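To see what the lazy, depth-first evaluation in the flupy example amounts to, here is a minimal standard-library sketch of the same pipeline using generators and itertools. It is not flupy's implementation; chunked and first_chunks are hypothetical helper names introduced only for this illustration.

```python
from itertools import count, islice


def chunked(iterable, size):
    """Lazily yield lists of up to `size` consecutive items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def first_chunks():
    """Mirror the flupy example: square, filter, chunk by 5, take 3 chunks."""
    squares = (x ** 2 for x in count())                # infinite, nothing computed yet
    multiples = (s for s in squares if s % 517 == 0)   # still lazy
    return list(islice(chunked(multiples, 5), 3))      # pulls only what is needed


if __name__ == "__main__":
    for item in first_chunks():
        print(item)
```

Because every stage is a generator, the infinite count() source is only advanced as far as the three requested chunks require, which is what keeps memory use small.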
Also if you are LINQ inclined: https://github.com/sixty-north/asq asq is a simple implementation of a LINQ-inspired API for Python which operates over Python iterables, including a parallel version implemented in terms of the Python standard library multiprocessing module. # ASQ >>> from asq import query >>> words = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"] >>> query(words).order_by(len).then_by().take(5).select(str.upper).to_list() ['ONE', 'SIX', 'TEN', 'TWO', 'FIVE'] Brian #5: Guido blogging again What to do with your computer science career Answering “A question about whether to choose a 9-5 job or be an entrepreneur” entrepreneurship isn’t for everyone working for someone else can be very rewarding shoot for “better than an entry-level web development job” And “A question about whether AI would make human software developers redundant (not about what I think of the field of AI as a career choice)” AI is about automating tasks that can be boring Software Engineering is never boring. Michael #6: Web apps in pure Python with Anvil Design with our visual designer Build with nothing but Python Publish Instant hosting in the cloud or on-site Paid product but has a free version Covered on Talk Python 138 Extras: Second Printing (P2) of “Python Testing with pytest”
November 23, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Colorizing and Restoring Old Images with Deep Learning Text interview by Charlie Harrington of Jason Antic, developer of DeOldify A whole bunch of machine learning buzzwords that I don’t understand in the slightest combine to make a really cool tool to make B&W photos look freaking amazing. “This is a deep learning based model. More specifically, what I've done is combined the following approaches: Self-Attention Generative Adversarial Network Training structure inspired by (but not the same as) Progressive Growing of GANs. Two Time-Scale Update Rule. Generator Loss is two parts: One is a basic Perceptual Loss (or Feature Loss) based on VGG16. The second is the loss score from the critic.” Michael #2: PlatformIO IDE for VSCode via Jason Pecor PlatformIO is an open source ecosystem for IoT development Cross-platform IDE and unified debugger. Remote unit testing and firmware updates Built on Visual Studio Code which has a nice extension for Python PlatformIO, combined with the features of VSCode, provides some great improvements for project development over the standard Arduino IDE for Arduino-compatible microcontroller based solutions. Some of these features are paid, but it’s a reasonable price With Python becoming more popular for microcontroller design, as well, this might be a very nice option for designers. And for Jason, specifically, it provides a single environment that can eventually be configured to handle doing the embedded code design, associated Python supporting tools mods, and HDL development. The PlatformIO Core is written in Python. Python 2.7 (hiss…) Jason’s test drive video from Tuesday: Test Driving PlatformIO IDE for VSCode Brian #3: Python Data Visualization 2018: Why So Many Libraries? Nice overview of visualization landscape, by Anaconda team Differentiating factors, API types, and emerging trends Related: Drawing Data with Flask and matplotlib Finally! 
A really simple example app in Flask that shows how to both generate and display matplotlib plots. I was looking for something like this about a year ago and didn’t find it. Michael #4: coder.com - VS Code in the cloud Full Visual Studio Code, but in your browser Code in the browser Access up to 96 cores VS Code + extensions, so all the languages and features Collaborate in real time, think google docs Access linux from any OS Note: They sponsored an episode of Talk Python To Me, but this is not an ad here... Brian #5: By Welcoming Women, Python’s Founder Overcomes Closed Minds In Open Source Forbes’s article about Guido and the Python community actively working to get more women involved in core development as well as speaking at conferences. Good lessons for other projects, and work teams, about how you cannot just passively “let people join”, you need to work to make it happen. Michael #6: Machine Learning Basics From Anna-Lena Popkes Plain python implementations of basic machine learning algorithms Repository contains implementations of basic machine learning algorithms in plain Python (modern Python, yay!) All algorithms are implemented from scratch without using additional machine learning libraries. Goal is to provide a basic understanding of the algorithms and their underlying structure, not to provide the most efficient implementations. Most of the algorithms Linear Regression Logistic Regression Perceptron k-nearest-neighbor k-Means clustering Simple neural network with one hidden layer Multinomial Logistic Regression Decision tree for classification Decision tree for regression Anna-Lena was on Talk Python on 186: http://talkpython.fm/186 Extras: Michael: PSF Fellow Nominations are open Michael: Shiboken has no meaning Brian: Python 3.7 runtime now available in AWS Lambda
November 17, 2018
Python Bytes 104 Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #0.1: Chapters and play at Chapters are now in the mp3 file Play at button on the website (doesn’t work on iOS unless you click the play to start it) Michael #0.2: Become a friend of the show https://pythonbytes.fm/friends-of-the-show Or just click “friends of the show” in the navbar Brian #1: wily: A Python application for tracking, reporting on timing and complexity in tests and applications. Anthony Shaw (aka “Friend of the Show”, aka “Ant”) (if listing 2 “aliases”, do you just put one “aka” or one per alias?) I should cover this on Test & Code for the content of the package. But it’s the actual packaging that I want to talk about today. Wily is a code base that can be used as an example of embracing pyproject.toml (pyproject.toml discussed on PB 100 and T&C 52) A real nice clean project using newer packaging tools that also has some frequently used bells and whistles NO setup.py file wily’s pyproject.toml includes flit packaging, metadata, scripts tox configuration black configuration project also has testing done on TravisCI rst based docs and readthedocs updates code coverage black pre-commit for wily pre-commit hook for your project to run wily CONTRIBUTING.md that includes code of conduct HISTORY.md with a nice format tests using pytest Michael #2: Latest VS Code has Jupyter support In this release, closed a total of 49 issues, including: Jupyter support: import notebooks and run code cells in a Python Interactive window Use new virtual environments without having to restart Visual Studio Code Code completions in the debug console window Improved completions in language server, including recognition of namedtuple, and generic types The extension now contains new editor-centric interactive programming capabilities built on top of Jupyter. Have Jupyter installed in your environment (e.g. set your environment to Anaconda) and type #%% into a Python file to define a Cell. 
You will notice a “Run Cell” code lens will appear above the #%% line: Cells in the Jupyter Notebook will be converted to cells in a Python file by adding #%% lines. You can run the cells to view the notebook output in Visual Studio code, including plots Brian #3: API Evolution the Right Way A. Jesse Jiryu Davis adding features removing features adding parameters changing behavior Michael #4: PySimpleGUI now on Qt Project by Mike B Covered back on https://pythonbytes.fm/episodes/show/90/a-django-async-roadmap Simple declarative UI “builder” Looking to take your Python code from the world of command lines and into the convenience of a GUI? Have a Raspberry Pi with a touchscreen that's going to waste because you don't have the time to learn a GUI SDK? Look no further, you've found your GUI package. Now supports Qt Modern Python only More frameworks likely coming Brian #5: Comparison of the 7 governance PEPs Started by Victor Stinner The different PEPs are compared by: hierarchy number of people involved requirements for candidates to be considered for certain positions elections: who votes, and how term limits no confidence vote teams/experts PEP process core dev promotion and ejection how governance will be updated code of conduct PEP 8000, Python Language Governance Proposal Overview: PEP 8010 - The Technical Leader Governance Model continue status quo (ish) PEP 8011 - Python Governance Model Lead by Trio of Pythonistas like status quo but with 3 co-leaders PEP 8012 - The Community Governance Model no central authority PEP 8013 - The External Governance Model non-core oversight PEP 8014 - The Commons Governance Model core oversight PEP 8015 - Organization of the Python community push most decision-making to teams PEP 8016 - The Steering Council Model bootstrap iterating on governance Michael #6: Shiboken (from Qt for Python project) From PySide2 (AKA Qt for Python) project Generate Python bindings from arbitrary C/C++ code Has a Typesystem (based on XML) which 
allows modifying the obtained information to properly represent and manipulate the C++ classes into the Python World. Can remove and add methods to certain classes, and even modify the arguments of each function, which is really necessary when both C++ and Python collide and a decision needs to be made to properly handle the data structures or types. Qt for Python: under the hood Write your own Python bindings Other options include: CFFI (example dbader.org) Cython (example: via shamir.stav) Extras: Michael: Mission Python: Code a Space Adventure Game! book Michael: PyCon tickets are on sale Michael: PyCascade tickets are on sale
November 8, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: FEniCS “FEniCS is a popular open-source (LGPLv3) computing platform for solving partial differential equations (PDEs). FEniCS enables users to quickly translate scientific models into efficient finite element code. With the high-level Python and C++ interfaces to FEniCS, it is easy to get started, but FEniCS offers also powerful capabilities for more experienced programmers. FEniCS runs on a multitude of platforms ranging from laptops to high-performance clusters.” Solves partial differential equations efficiently with a combination of C++ and Python. Can be run on a desktop/laptop or deployed to a supercomputer with thousands of parallel processes. is a NumFOCUS fiscally supported project “makes the implementation of the mathematical formulation of a system of partial differential equations almost seamless.” - Sébastien Brisard “FEniCS is in fact a C++ project with a full-featured Python interface. The library itself generates C++ code on-the-fly, that can be called (on-the-fly) from python. It's almost magical... Under the hood, it used to use SWIG, and recently moved to pybind11. I guess the architecture that was set up to achieve this level of automation might be useful in other situations.” - Sébastien Brisard Michael #2: cursive_re via Christopher Patti, created by Bogdan Popa Readable regular expressions for Python 3.6 and up. It’s a tiny Python library made up of combinators that help you write regular expressions you can read and modify six months down the line. Best understood via an example: >>> hash = text('#') >>> hexdigit = any_of(in_range('0', '9') + in_range('a', 'f') + in_range('A', 'F')) >>> hexcolor = ( ... beginning_of_line() + hash + ... group(repeated(hexdigit, exactly=6) | repeated(hexdigit, exactly=3)) + ... end_of_line() ... 
) >>> str(hexcolor) '^\\#([a-f0-9]{6}|[a-f0-9]{3})$' Has automatic escaping for [ and \ etc: str(any_of(text("[]"))) → '[\\[\\]]' Easily testable / inspectable. Just call str on any expression. Brian #3: pyimagesearch Adrian Rosebrock is focused on teaching OpenCV with Python Just a really cool resource of integrating computer vision and Python. Both free and paid resources. He had one of the most successful tech learning kickstarters (ever?) on this topic: https://www.kickstarter.com/projects/adrianrosebrock/deep-learning-for-computer-vision-with-python-eboo Michael #4: Visualization of Python development up till 2012 via Ophion Group (on twitter) mercurial (hg) source code repository commit history August 1990 - June 2012 (cpython 3.3.0 alpha) Watch the first minute, then click ahead minute at a time and watch for a few seconds to get the full feel Really interesting to see a visual representation of the growth of an open source ecosystem Built with Gource: https://gource.io/ Amazing video of the history gource and its visualization of various projects: https://vimeo.com/15943704 Who wants to build this for 2012-present? Would make an amazing lightning talk! Brian #5: Getting to 10x (Results): What Any Developer Can Learn from the Best Forget the “10x” bit if that term is fighting words. - Brian’s advice How about just “ways to improve your effectiveness as a developer”? “… there is a clear path to excellence. People aren’t born great developers. 
They get there through focused, deliberate practice.” traits of great developers: problem solver skilled mentor/teacher excellent learner passionate traits to avoid: incompetent arrogant uncooperative unmotivated stubborn Focus on your strengths more than your weaknesses Pick 1 thing to improve on this week and focus on it relentlessly Michael #6: Chaos Toolkit Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in production. Netflix uses the Chaos Monkey (et al.) on their systems. Covered on https://talkpython.fm/episodes/show/16/python-at-netflix The Chaos Toolkit aims to be the simplest and easiest way to explore building, and automating, your own Chaos Engineering Experiments. Integrates with Kubernetes, AWS, Google Cloud, Microsoft Azure, etc. To give you an idea, here are some things it can do to AWS: lambda: delete_function_concurrency Removes concurrency limit applied to the specified Lambda stop_instance Stop a single EC2 instance. You may provide an instance id explicitly or, if you only specify the AZ, a random instance will be selected. Extras: MK: Malicious Python Libraries Found & Removed From PyPI MK: Some really long type names Brian: Deep dive into pyproject.toml and the future of Python packaging with Brett Cannon, follow-up from Python Bytes episode 100
October 31, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: QuantEcon “Open source code for economic modeling” “QuantEcon is a NumFOCUS fiscally sponsored project dedicated to development and documentation of modern open source computational tools for economics, econometrics, and decision making.” Educational resource that includes: Lectures, workshops, and seminars Cheatsheets for scientific programming in Python and Julia Notebooks QuantEcon.py : open source Python code library for economics Michael #2: Structure of a Flask Project Flask is very flexible; it has no set pattern for a project folder structure. Here are some suggestions. I always keep this one certain rule when writing modules and packages: “Don't backward import from root __init__.py.” Candidate structure: project/ __init__.py models/ __init__.py users.py posts.py ... routes/ __init__.py home.py account.py dashboard.py ... templates/ base.html post.html ... services/ __init__.py google.py mail.py Love it! To this, I would rename routes to views or controllers and add a viewmodels folder and viewmodels themselves. Brian, see anything missing? ya. tests. :) Another famous folder structure is an app-based structure, which means things are grouped by application. I (Michael) STRONGLY recommend Flask blueprints Brian #3: Overusing lambda expressions in Python lambda expressions vs defined functions They can be immediately passed around (no variable needed) They can only have a single line of code within them They return automatically They can’t have a docstring and they don’t have a name They use a different and unfamiliar syntax misuses: naming them. Just write a function instead calling a single function with a single argument : just use that func instead overuse: if they get complex, even a little bit, they are hard to read has to be all on one line, which reduces readability map and filter : use comprehensions instead using custom lambdas instead of using operators from the operator module.
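A rough sketch of the swaps suggested in the lambda item above, using only the standard library. The sample data and the names `words`, `pairs`, and `shout` are made up for illustration and do not come from the article.

```python
from functools import reduce
from operator import itemgetter, mul

words = ["pear", "fig", "banana"]
pairs = [("b", 2), ("a", 3), ("c", 1)]


# Instead of naming a lambda (shout = lambda w: w.upper()), write a function:
# it gets a real name and room for a docstring.
def shout(word):
    """Return the word in upper case."""
    return word.upper()


# Instead of key=lambda w: len(w), pass the function itself.
by_length = sorted(words, key=len)

# Instead of map/filter with lambdas, use a comprehension.
long_upper = [shout(w) for w in words if len(w) > 3]

# Instead of small custom lambdas, use helpers from the operator module.
by_count = sorted(pairs, key=itemgetter(1))  # replaces lambda p: p[1]
product = reduce(mul, [1, 2, 3, 4])          # replaces lambda a, b: a * b
```

Lambdas still fit short throwaway callbacks; the point is only that each of these cases has a more readable spelling.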
Michael #4: Asyncio in Python 3.7 by Cris Medina The release of Python 3.7 introduced a number of changes into the async world. Some may affect you even if you don’t use asyncio. New Reserved Keywords: The async and await keywords are now reserved. Quite a few modules are already broken because of this. However, the fix is easy: rename any variables and parameters. Context Variables: Version 3.7 now allows the use of context variables within async tasks. If this is a new concept to you, it might be easier to picture it as global variables whose values are local to the currently running coroutines. Python has similar constructs for doing this very thing across threads. However, those were not sufficient in async-world. New asyncio.run() function: With a call to asyncio.run(), we can now automatically create a loop, run a task on it, and close it when complete. Simpler Task Management: Along the same lines, there’s a new asyncio.create_task() function that helps make tasks that run inside the current loop, instead of having to get the loop first and call create_task on it. Simpler Event Loop Management: The addition of asyncio.get_running_loop() will help determine the active event loop, and raises a RuntimeError if there’s no loop running. Async Context Managers: Another quality-of-life improvement. We now have the asynccontextmanager() decorator for producing async context managers without the need for a class that implements __aenter__() or __aexit__(). Performance Improvements: Several functions are now optimized for speed, some were even reimplemented in C. Here’s the list: asyncio.get_event_loop() is now 15 times faster. asyncio.gather() is 15% faster. asyncio.sleep() is two times faster when the delay is zero or negative. asyncio.Future callback management is optimized. Reduced overhead for asyncio debug mode. 
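To make the 3.7 additions above concrete, here is a minimal sketch that touches asyncio.run(), asyncio.create_task(), asyncio.get_running_loop(), a context variable, and asynccontextmanager() in one place. The names `request_id`, `tagged`, and `worker` are hypothetical and exist only for this example.

```python
import asyncio
import contextvars
from contextlib import asynccontextmanager

# A context variable: each task works on its own copy of the context.
request_id = contextvars.ContextVar("request_id", default="none")


@asynccontextmanager
async def tagged(name):
    """Set request_id inside the block and restore it afterwards."""
    token = request_id.set(name)
    try:
        yield name
    finally:
        request_id.reset(token)


async def worker(name):
    async with tagged(name):
        await asyncio.sleep(0)   # yield control so the tasks interleave
        return request_id.get()  # still this task's own value


async def main():
    # get_running_loop() only works while a loop is running.
    loop = asyncio.get_running_loop()
    assert loop.is_running()
    # create_task() schedules on the current loop, no loop object needed.
    tasks = [asyncio.create_task(worker(n)) for n in ("a", "b")]
    return await asyncio.gather(*tasks)


# asyncio.run() creates the loop, runs main(), and closes the loop.
results = asyncio.run(main())
```

Each task keeps its own `request_id` even though both run on the same loop, which is the "global variable local to the running coroutine" picture from the notes.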
Lots lots more Brian #5: Giving thanks with **pip thank** proposal: https://github.com/pypa/pip/issues/5970 Michael #6: Getting Started With Testing in Python by Anthony Shaw, 33 minutes reading time according to Instapaper Automated vs. Manual Testing Unit Tests vs. Integration Tests: A unit test is a smaller test, one that checks that a single component operates in the right way. A unit test helps you to isolate what is broken in your application and fix it faster. Compares unittest, nose or nose2, pytest Covers things like: Writing Your First Test Where to Write the Test How to Structure a Simple Test How to Write Assertions Dangers of Side Effects Testing in PyCharm and VS Code Testing for Web Frameworks Like Django and Flask Advanced Testing Scenarios Even: Testing for Security Flaws in Your Application Extras: MK: Hack ur name — aka Pivot me bro (done in Python: https://github.com/veekaybee/hustlr ) by Vicki Boykis MK: Python 3.7.1 and 3.6.7 Are Now Available MK: Click-Driven Development (CDD) - via @tombaker Use Python Click package to mock up suite of commands w/options/args. Decorated functions print description of intended results. Replace placeholders with code.
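As a small illustration of the unit test idea from the testing item in this episode (one test checks one component), here is a sketch with the standard library's unittest. The `total` function is a stand-in written for this example, not code from the article.

```python
import unittest


def total(values):
    """The single component under test: add up an iterable of numbers."""
    return sum(values)


class TestTotal(unittest.TestCase):
    def test_list_of_ints(self):
        self.assertEqual(total([1, 2, 3]), 6)

    def test_empty_input(self):
        self.assertEqual(total([]), 0)

    def test_rejects_strings(self):
        # A side effect of using sum(): strings raise TypeError.
        with self.assertRaises(TypeError):
            total(["1", "2"])


# Load and run the tests programmatically instead of via the command line.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotal)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same three checks could be written as plain functions with bare assert statements under pytest; the structure (arrange input, call the component, assert on the output) stays the same.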
October 24, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Asterisks in Python: what they are and how to use them I just ** love *s Using * and ** to pass arguments to a function * for list, ** for keyword arguments from a dictionary Using * and ** to capture arguments passed into a function Using * to accept keyword-only arguments Using * to capture items during tuple unpacking you can capture the rest if you only want to grab a few Using * to unpack iterables into a list/tuple Using ** to unpack dictionaries into other dictionaries Michael #2: responder web framework From Kenneth Reitz: A familiar HTTP Service Framework Already has 1,393 GitHub stars Flask-like but with async support and: A pleasant API, with a single import statement. Class-based views without inheritance. ASGI framework, the future of Python web services. WebSocket support! The ability to mount any ASGI / WSGI app at a subroute. f-string syntax route declaration. Mutable response object, passed into each view. No need to return anything. Background tasks, spawned off in a ThreadPoolExecutor. GraphQL (with GraphiQL) support! OpenAPI schema generation. Single-page webapp support Responder gives you the ability to mount another ASGI / WSGI app at a subroute uvicorn: powers responder and is built on top of uvloop asgi: https://www.encode.io/articles/hello-asgi/ Brian #3: Python Example resource: pythonprogramming.in Lots of examples Python basics including date time, strings, dictionaries pandas, matplotlib, tensorflow basics data structures and algorithms Nice reference, especially for people getting into Python for data science or scientific work. 
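Going back to the asterisks item (#1) in this episode, the uses it lists can be sketched in a few lines. The functions `describe` and `connect` and the sample values are invented for the example.

```python
def describe(name, *args, sep=", ", **options):
    """*args captures extra positionals, **options captures extra keywords."""
    extras = sep.join(str(a) for a in args)
    return f"{name}: {extras} {sorted(options)}"


numbers = [1, 2, 3]
settings = {"sep": " | ", "verbose": True}

# * unpacks a list into positional arguments,
# ** unpacks a dictionary into keyword arguments.
line = describe("demo", *numbers, **settings)


# A bare * makes every parameter after it keyword-only.
def connect(host, *, timeout=10):
    return (host, timeout)


# * in tuple unpacking captures "the rest".
first, *middle, last = range(5)

# * and ** also unpack into new list and dict literals.
combined = [0, *numbers, 4]
merged = {**settings, "verbose": False}
```

Calling `connect("example.org", 5)` raises a TypeError because `timeout` must be passed by keyword, which is the point of the bare `*`.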
Michael #4: This year’s Nobel Prize in economics was awarded to a Python convert Nordhaus and Romer “have designed methods that address some of our time’s most fundamental and pressing issues: long-term sustainable growth in the global economy and the welfare of the world’s population,” Notably for a 62-year-old economist of his distinction, Romer is a user of the programming language Python. Romer believes in making research transparent. He argues that openness and clarity about methodology are important for scientific research to gain trust. He tried to use Mathematica to share one of his studies in a way that anyone could explore every detail of his data and methods. It didn’t work. He says that Mathematica’s owner, Wolfram Research, made it too difficult to share his work in a way that didn’t require other people to use the proprietary software, too. Romer believes that open-source notebooks are the way forward for sharing research. He believes they support integrity, while proprietary software encourages secrecy. “The more I learn about proprietary software, the more I worry that objective truth might perish from the earth,” he wrote. Michael covered a similar story for the Nobel Prize in Physics at CERN on Talk Python Jake Vanderplas Keynote at PyCon 2017: “The unexpected effectiveness of Python in Science” Brian #5: More in depth TensorFlow Michael #6: MAKERphone - an educational DIY mobile phone MAKERphone is an educational DIY mobile phone designed to bring electronics and programming to the crowd in a fun and interesting way. 
A fully functional mobile phone that you can code yourself Games such as space invaders, pong, or snake Apps such as a custom media player that only plays cat videos Programs in Arduino Lines of code in Python Your first working piece of code in Scratch A custom case Extras: MK: Around 62% of all Internet sites will run an unsupported PHP version in 10 weeks The highly popular PHP 5.x branch will stop receiving security updates at the end of the year.
October 19, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guests: Anthony Shaw Dan Bader Brett Cannon Nina Zakharenko Brian #1: poetry “poetry is a tool to handle dependency installation as well as building and packaging of Python packages. It only needs one file to do all of that: the new, standardized pyproject.toml. In other words, poetry uses pyproject.toml to replace setup.py, requirements.txt, setup.cfg, MANIFEST.in and the newly added Pipfile.” poetry can be used for both application and library development handles dependencies and lock files strongly encourages virtual environment use (need specifically turn it off) can be used within an existing venv or be used to create a new venv automates package build process automates deployment to PyPI or to another repository CLI and the use model is very different than pipenv. Even if they produced the same files (which they don’t), you’d still want to try both to see which workflow works best for you. For me, I think poetry matches my way of working a bit more than pipenv, but I’m still in the early stages of using either. From Python's New Package Landscape “PEP 517 and PEP 518—accepted in September 2017 and May 2016, respectively—changed this status quo by enabling package authors to select different build systems. Said differently, for the first time in Python, developers may opt to use a distribution build tool other than **distutils** or **setuptools**. The ubiquitous **setup.py** file is no longer mandatory in Python libraries.” PEP 517 -- A build-system independent format for source trees PEP 518 -- Specifying Minimum Build System Requirements for Python Projects Another project that utilizes pyproject.toml is flit, which seems to overlap quite a bit with poetry, but I don’t think it does the venv, dependency management, dependency updating, etc. See also: Clarifying PEP 518 (a.k.a. pyproject.toml) - From Brett Question for @Brett C 517 and 518 still say “provisional” and not “final”. What’s that mean? 
We are still allowed to tweak it as necessary before it Biggest difference is poetry uses pyproject.toml (PEP 518) instead of Pipfile. Replaces all others (setup.py, setup.cfg, requirements*.txt, MANIFEST.in) Even its lock file is in TOML Author “does not like” pipenv, or some of the decisions it has made. Note that Kenneth has recently made some calls to introduce more discussion and openness with a PEP-style process called PEEP (PipEnv Enhancement Proposals). E.g. uses a more extensive dependency resolver Pipenv does not support multiple environments (by design) making it useless for library development. Poetry makes this more open. See https://medium.com/@DJetelina/pipenv-review-after-using-in-production-a05e7176f3f0 Wait. Why am I doing your notes for you @Brian O ! (awesome. Thanks Ant.) Brett has had initial discussions on Twitter with both pipenv and poetry about possibly standardizing on a lockfile format so that’s the artifact these tools produce and everything else is tool preference Anthony Shaw #2: pylama and radon Have been investigating tools for measuring complexity and performance of code and how that relates to test If you can refactor your code so the tests still pass, the customers are still happy AND it’s simpler then that’s a good thing - right? Radon is a Python tool that leverages the AST to give statistics on Cyclomatic Complexity (number of decisions; nested ifs are bad), maintainability index (LoC & Halstead) and Halstead (number of operations and complexity of AST). Radon works by adding a ComplexityVisitor to the AST. Another option is Ned Batchelder’s McCabe tool which measures the number of possible branches (similar to cyclomatic) All of these tools are combined in pylama - a code linter for Python and JavaScript. Embeds pycodestyle, mccabe, radon, gjslint and pyflakes. 
Final goal is to have a pytest plugin that fails tests if you make your code more complicated Nina Zakharenko #3: Tools for teaching Python Teaching Python can come with hurdles — virtual environments, installing python3, pip, working with the command line. Put out a call on twitter asking - “What software and tools do you use to teach Python?”. 50 Responses, 414 votes, learned about lots of new tools. Read the thread. 27% use python or ipython repl 13% use built-in IDLE 39% use an IDE or editor - Visual Studio Code, PyCharm, Atom. 21% use other (mix of local and hosted Jupyter notebooks and other responses) New tools I learned about: Mu editor - simple python editor, great for those completely new to programming. Large buttons with common actions above the editor. Support for educational platforms Integrates with hardware platforms -- adafruit Circuit Playground, micro:bit PyGame Awesome tutorials Neuron plugin for VS Code, Hydrogen plugin for Atom Interactive coding environment, brings a taste of Jupyter notebooks into your editor. Targeted towards data scientists. Show evaluated values, output pane to display charts and graphs Import to/from Jupyter notebooks repl.it - open source hosted cloud repl with reasonable free tier project goal - zero effort setup 3 vertical panes: files, editor, repl, and a button to run the current code. no login, no signup needed to get started visual package installation - no running pip, requirements.txt automatically generated includes a debugger bpython - Used it years ago, still an active project. Fancy curses interface to the Python interactive interpreter. Windows, type hints, expected parameters lists. Really cool feature — you can rewind your session! Pops the last line, and the entire session is reevaluated. Easily reload imported modules. Honorable mentions: Edublocks - Teaching tool for kids, visually drag and drop blocks of Python code. Open source, created by Joshua Lowe, a brilliant 14 year old maker and programmer. 
pythonanywhere, codeskulptor.org, codesters. Dan Bader #4: My favorite tool of 2018: “Black” code formatter by Łukasz Langa Black is the “uncompromising Python code formatter” An opinionated auto-formatter for your code (like YAPF/autopep8 for Python, or gofmt for Go, which popularized the idea) Heard about it in episode #73 by Brian Started using it for some small tools, then rolled it out to the whole realpython.com code base including our public example code repo (https://github.com/realpython/materials) Benefits are: Auto formatting: Not only does it call you out on formatting violations, it auto-fixes them Code style discussions disappear: just use whatever Black does Super easy to make several code bases look consistent (no more mental gymnastics to format new code to match its surroundings) Automatically enforce consistent formatting on CI with “black --check” (I use a combo of flake8 + black because flake8 also catches syntax errors and some other “code smells”) pro-tip: set up a pre-commit hook/rule to automatically run black before committing to Git. Also add it to your editor workflow (reformat on save / reformat on paste) Tool support: Built into the Python extension for VS Code (which Łukasz uses 😉) Plug-in for PyCharm (for Michael and Brian 😁 ) Support in pre-commit For the most part I really like the formatting Black applies; if you’re not a fan you might hate this tool because it makes your code look “ugly” 🙂 Still in beta but found it very useful and helpful as of October 2018. Give it a try! Brett Cannon #5: A Web without JavaScript: Russell Keith-Magee at PyCon AU JavaScript has a monopoly in web browsers for client-side programming Mono-language situations are not good for anyone Can Python somehow break into the client-side web world? 
Example implementation of Luhn algorithm: JavaScript: 0.4KB Transcrypt: transpile to 32KB Brython: Python compiler for 0.5KB + 646KB bootstrap Batavia: Eval loop for 1.2KB + 5MB bootstrap Pyodide: CPython compiled to WASM for 0.5KB + 3MB bootstrap WASM as a Python target might make this feasible Example written in C compiled to 22KB (w/ a 65KB bootstrap for older browsers) Maybe easier to target Electron/Node instead of client-side web initially? Scott Hanselman’s blog post https://www.hanselman.com/blog/JavaScriptIsWebAssemblyLanguageAndThatsOK.aspx Hanselminutes interview https://hanselminutes.com/638/c-and-browser-monoculture-with-vivaldis-patricia-aas Michael #6: Async WebDriver implementation for asyncio and asyncio-compatible frameworks You’ve heard of Selenium but in an async world what do we use? Answer: arsenic # Example: Let's run a local Firefox instance. async def example(): # Runs geckodriver and starts a firefox session async with get_session(Geckodriver(), Firefox()) as session: # go to example.com await session.get('http://example.com') # wait up to 5 seconds to get the h1 element from the page h1 = await session.wait_for_element(5, 'h1') # print the text of the h1 element print(await h1.get_text()) Use cases include testing of web applications, load testing, automating websites, web scraping or anything else you need a web browser for. It uses real web browsers using the Webdriver specification. Warning: While this library is asynchronous, web drivers are not. You must call the APIs in sequence. The purpose of this library is to allow you to control multiple web drivers asynchronously or to use a web driver in the same thread as an asynchronous web server. Arsenic with pytest Supported browsers Headless Google Chrome Headless Firefox Everyone’s thoughts on async in Python these days? 
Selenium-Grid https://www.seleniumhq.org/docs/07_selenium_grid.jsp Extra: Take the python survey: https://talkpython.fm/survey2018 3.7.1rc1 is out https://docs.python.org/3.7/whatsnew/changelog.html#python-3-7-1-release-candidate-1 A good review on Python packaging http://andrewsforge.com/article/python-new-package-landscape/ New September release of Python Extension for Visual Studio Code — lots of new features, like automatic environment activation in the terminal, debugging improvements, and more! Submit a talk to PyCascades happening February 2019 in Seattle. Call for proposals closes October 21st. Mentorship available.
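For context on the size comparison in Brett's item (#5): the Luhn checksum used as the benchmark is only a few lines of plain Python. This is a generic sketch of the algorithm, not the implementation from the talk, and `luhn_valid` is a name chosen here.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


ok = luhn_valid("79927398713")       # well-known valid example
bad = luhn_valid("79927398710")      # last digit changed
card = luhn_valid("4111 1111 1111 1111")  # spaces are ignored
```

The kilobyte figures in the notes are about shipping logic of this size to a browser, where the runtime bootstrap dominates the payload.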
October 16, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Forbes cyber article: Cyber Saturday: Doubts Swirl Around Bloomberg's China Chip Hack Report Brian #1: parse “parse() is the opposite of format()” regex not required for parsing strings. Provides these functions: parse(), search(), findall(), and with_pattern() >>> parse("It's {}, I love it!", "It's spam, I love it!") <Result ('spam',) {}> >>> search('Age: {:d}\n', 'Name: Rufus\nAge: 42\nColor: red\n') <Result (42,) {}> >>> ''.join(r.fixed[0] for r in findall(">{}<", "<p>the <b>bold</b> text</p>")) 'the bold text' Can also compile for repeated use. Michael #2: fman Build System FBS lets you create GUI apps for Windows, Mac and Linux via Michael Herrmann Build Python GUIs, with Qt, in minutes Write a desktop application with PyQt or Qt for Python. Use fbs to package and deploy it on Windows, Mac and Linux. Avoid months of painful work with the proven solutions provided by fbs. Easy Packaging: Unlike other solutions, fbs makes packaging easy. Create installers for your app in seconds and distribute them to your users on Windows, Mac and Linux! Open Source: fbs's source code is available on GitHub. You can use it for free in open source projects licensed under the GPL. Commercial licenses are also offered. Free under the GPL. If that's too restrictive, a commercial license is 250 Euros once. PyQt's licensing is similar (GPL/Commercial). A license for it is € 450 (source). Came from fman, a dual-pane file manager for Mac, Windows and Linux Brian #3: fastjsonschema Validate JSON against a schema, quickly. Michael #4: IPython 7.0, Async REPL via Nick Spirit Article by Matthias Bussonnier We are pleased to announce the release of IPython 7.0, the powerful Python interactive shell that goes above and beyond the default Python REPL with advanced tab completion, syntactic coloration, and more. 
Not having to support Python 2 allowed us to make full use of new Python 3 features and bring never before seen capability in a Python Console, see the Python 3 Statement. One of the core features we focused on for this release is the ability to (ab)use the async and await syntax available in Python 3.5+. TL;DR: You can now use async/await at the top level in the IPython terminal and in the notebook, it should — in most of the cases — “just work”. The only thing you need to remember is: If it is an async function you need to await it. Brian #5: molten Michael #6: A Python love letter Dear Python, where have you been all my life? (reddit thread) I am NOT a developer. But, I've tinkered with programming (in BASIC, Visual Basic, Perl, now Python) when needed over the years I decided that I needed to script something, and hoped that learning how to do it in Python was going to take me significantly less time than doing it manually - with the benefit of future timesavings. No, I didn't go from 0 to production in a day. But if my coworkers will leave me alone, I might be in production by the end of the day tomorrow. What I'm working on today isn't super complex — But putting together what I've done so far has just been a complete joy. Overall it feels natural, intuitive, and relatively easy to understand and write the code for the basic things I'm doing - I haven't had this much fun doing stuff with code since the days fooling around with BASIC in my teens. Feedback / comments Welcome to the club. I came up on c++; my job highly trained me in C and assembly but every project I touch I think, wait, "we can do 95% this in python". And we do. I used to have a chip on my shoulder. I wanted to do things the hard way to truly understand them. I went with C++. … I learned that doing things the smart way was better than doing things the hard way and didn't interfere with learning. I felt the exact same way I finally decided to learn it. It's like a breath of fresh air. 
Sadly there are few things in my life that made me feel like this, Python and Bitcoin both give me the same levels of enjoyment. … I've used Java, Groovy, Scala, Objective-C, C, C++, C#, Perl and Javascript in a professional capacity over the years and nothing feels as natural to me as Python does. The developers truly deserve any donations they get for making it. … Hell my next two planned tattoos are bitcoin and python logos on my wrists. I taught myself Python a little over 3 years ago and I quickly went from not being programmer to being a programmer. … However the real popularity of Python comes from the depth and quality of 3rd party libraries and how easy they are to install. Extra: Brian: Power Mode II
October 8, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Making Etch-a-Sketch Art With Python Really nice write up of methodically solving problems with simplifying the problem space, figuring out what parts need solved, grabbing off the shelf bits that can help, and putting it all together. Plus it would be a fun weekend (or several) project with kids helping. Controlling the Etch-a-Sketch Raspberry Pi, motors, cables, wood fixture Software to control the motors Picture simplification with edge detection with Canny edge detection. Lines to motor control with path finding with networkx library. Example results included in article. Pentium song: https://www.youtube.com/watch?v=qpMvS1Q1sos Michael #2: Dropbox moves to Python 3 They just rolled out one of the largest Python 3 migrations ever Dropbox is one of the most popular desktop applications in the world Much of the application is written using Python. In fact, Drew’s very first lines of code for Dropbox were written in Python for Windows using venerable libraries such as pywin32. Though we’ve relied on Python 2 for many years (most recently, we used Python 2.7), we began moving to Python 3 back in 2015. If you’re using Dropbox today, the application is powered by a Dropbox-customized variant of Python 3.5. Why Python 3? Exciting new features: Type annotations and async & await Aging toolchains: As Python 2 has aged, the set of toolchains initially compatible for deploying it has largely become obsolete Embedding Python To solve build and deploy problem, we decided on a new architecture to embed the Python runtime in our native application. Deep integration with the OS (e.g. smart sync) means native apps are required In future posts, we’ll look at: How we report crashes on Windows and macOS and use them to debug both native and Python code. How we maintained a hybrid Python 2 and 3 syntax, and what tools helped. Our very best bugs and stories from the Python 3 migration. 
Brian #3: Resources for PyCon that relate to really any talk venue Speaking page Talk proposal tips and resources And the poster session. Way cooler than I originally understood. Mariatta recently published her set of proposals Nice clean examples that don’t look overwhelming There’s also some links to examples at the talk proposal page. Related, on attending PyCon (or other technical conferences): You don't need to be a Pro @ Python to crack the code of Pycon missing: hang out and talk with, ask questions, and possibly help out with communities as part of the Expo. Michael #4: Electron as GUI of Python Applications via Andy Bulka Electron Python is a template of code where you use Electron (nodejs + chromium) as a GUI talking to Python 3 as a backend via zerorpc. Similar to Eel but much more capable e.g. you get proper native operating system menus — and users don’t need to have Chrome already installed. Needs to run zerorpc server and then start electron separately — can be done via the node backend using Electron as a GUI toolkit gets you native menus, notifications installers, automatic updates to your app debugging and profiling that you are used to, using the Chrome debugger ES6 syntax (a cleaner Javascript with classes, module imports, no need for semicolons etc.). Squint, look sideways, and it kinda looks like Python… ;-) the full power of nodejs and its huge npm package repository the large community and ecosystem of Electron How to package this all? Building a deployable Python-Electron App post by Andy Bulka One of the great things about using Electron as a GUI for Python is that you get to use cutting edge web technologies and you don’t have to learn some old, barely maintained GUI toolkit How much momentum, money, time and how many developer minds are focused on advancing web technologies? Answer: it’s staggeringly huge. Compare this with the number of people maintaining old toolkits from the 90’s e.g. wxPython? 
Answer: perhaps one or two people in their spare time. Which would you rather use? Final quote: And someone please wrap Electron-Python into an IDE so that in the future all we have to do is click a ‘build’ button — like we could 20 years ago. :-) Brian #5: pluggy: A minimalist production ready plugin system docs plugin management and hook system used by pytest A separate package to allow other projects to include plugin capabilities without exposing unnecessary state or behavior of the host project. Michael #6: How China Used a Tiny Chip to Infiltrate U.S. Companies via Eduardo Orochena The attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America’s technology supply chain, according to extensive interviews with government and corporate sources. In 2015, Amazon.com Inc. began quietly evaluating a startup called Elemental Technologies, a potential acquisition to help with a major expansion of its streaming video service, known today as Amazon Prime Video. (from Portland!) To help with due diligence, AWS, which was overseeing the prospective acquisition, hired a third-party company to scrutinize Elemental’s security servers were assembled for Elemental by Super Micro Computer Inc., a San Jose-based company (commonly known as Supermicro) that’s also one of the world’s biggest suppliers of server motherboards Nested on the servers’ motherboards, the testers found a tiny microchip, not much bigger than a grain of rice, that wasn’t part of the boards’ original design. Amazon reported the discovery to U.S. authorities, sending a shudder through the intelligence community. Elemental’s servers could be found in Department of Defense data centers, the CIA’s drone operations, and the onboard networks of Navy warships. And Elemental was just one of hundreds of Supermicro customers. 
During the ensuing top-secret probe, which remains open more than three years later, investigators determined that the chips allowed the attackers to create a stealth doorway into any network that included the altered machines. Multiple people familiar with the matter say investigators found that the chips had been inserted at factories run by manufacturing subcontractors in China. One government official says China’s goal was long-term access to high-value corporate secrets and sensitive government networks. No consumer data is known to have been stolen. American investigators eventually figured out who else had been hit. Since the implanted chips were designed to ping anonymous computers on the internet for further instructions, operatives could hack those computers to identify others who’d been affected. Extra: Michael's Async course talkpython.fm/async
September 28, 2018
Sponsored by DataDog -- pythonbytes.fm/datadog Brian #1: Making a PyPI-friendly README twine now checks for rendering problems with README Install the latest version of twine; version 1.12.0 or higher is required: pip install --upgrade twine Build the sdist and wheel for your project as described under Packaging your project. Run twine check on the sdist and wheel: twine check dist/* This command will report any problems rendering your README. If your markup renders fine, the command will output Checking distribution FILENAME: Passed. Michael #2: Java goes paid Oracle's new Java SE subs: Code and support for $25/processor/month Prepare for audit after inevitable change, says Oracle licensing consultant There’s also a little bit of stick to go with the carrot, because come January 2019 Java SE 8 on the desktop won’t be updated any more … unless you buy a sub. The short version is that every commercial enterprise needs to look at their Java SE (Standard Edition) usage to see if they need to do something with licensing. Brian #3: Absolute vs Relative Imports in Python Review of how imports are used, along with subpackages and from ex: from package.sub import func Relative: what does this mean: from .some_module import some_class from ..some_package import some_function from . import some_class Michael #4: pyxel - A retro game engine for Python Thanks to its simple specifications inspired by retro gaming consoles, such as only 16 colors can be displayed and only 4 sounds can be played back at the same time, you can feel free to enjoy making pixel art style games. Run on Windows, Mac, and Linux Code writing with Python3 After installing Pyxel, the examples of Pyxel will be copied to the current directory with the following command: install_pyxel_examples Brian #5: Click 7.0 Released Changelog Drop support for Python 2.6 and 3.3. Add native ZSH autocompletion support. 
Usage errors now hint at the --help option. Really long list of changes since the last release at the beginning of 2017. Michael #6: How we spent 30k USD in Firebase in less than 72 hours The largest crowdfunding campaign in Colombia, collecting 3 times more than the previous record in only two days! Run on the Vaki platform -- subject of this article. We had reached more than 2 million sessions, more than 20 million pages visited and received more than 15 thousand supports. This averages out to a thousand users active on the site and more than 20 supports collected per minute. Site was running slow; they tried things like upgrading the frontend frameworks. Logged into Firebase: had spent $30,356.56 USD in just 72 hours! Going at $600/hr. It all came down to a very bad implementation of this.loadPayments(). Comments are interesting. It could happen to any of us, it happened to me this month. Extras: Dropbox has upgraded from Python 2 → 3! Michael’s async course is live: Async Techniques and Examples in Python. 2019 PyCon CFPs open. PyCascades CFP is open until mid-Oct.
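A quick sketch for the absolute vs relative imports item (Brian #3) in this episode. It builds a tiny throwaway package in a temp directory and imports it, so the three relative forms can be seen next to an absolute import. The names demo_pkg, helpers, sibling and mod are hypothetical, made up for this demo, and are not from the article.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a hypothetical package layout: demo_pkg/helpers.py and demo_pkg/sub/{sibling,mod}.py."""
    pkg = root / "demo_pkg"
    sub = pkg / "sub"
    sub.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "helpers.py").write_text("def func():\n    return 'helpers.func'\n")
    (sub / "__init__.py").write_text("")
    (sub / "sibling.py").write_text(
        "def some_function():\n    return 'sibling.some_function'\n"
    )
    (sub / "mod.py").write_text(
        "from demo_pkg.helpers import func         # absolute: full path from the top-level package\n"
        "from .sibling import some_function        # relative: one dot = the current package (demo_pkg.sub)\n"
        "from ..helpers import func as func_again  # relative: two dots = the parent package (demo_pkg)\n"
        "from . import sibling                     # relative: import a sibling module itself\n"
    )


def load_demo():
    """Write the package to a temp directory, import demo_pkg.sub.mod, and return the module."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return importlib.import_module("demo_pkg.sub.mod")
        finally:
            sys.path.remove(tmp)


mod = load_demo()
print(mod.func())                  # reached through the absolute import
print(mod.some_function())         # reached through "from .sibling import ..."
print(mod.func_again is mod.func)  # both spellings resolve to the same function object
print(mod.sibling.__name__)        # fully qualified name of the sibling module
```

Relative imports only work inside a package, which is why the demo imports mod as demo_pkg.sub.mod rather than running mod.py as a script.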
September 22, 2018
Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean Brian #1: Plumbum: Shell Combinators and More Toolbox of goodies to do shell-like things from Python. “The motto of the library is “Never write shell scripts again”, and thus it attempts to mimic the shell syntax (shell combinators) where it makes sense, while keeping it all Pythonic and cross-platform.” Example: >>> from plumbum.cmd import ls, grep, wc, cat, head >>> chain = ls["-a"] | grep["-v", "\\.py"] | wc["-l"] >>> print(chain) /bin/ls -a | /bin/grep -v '\.py' | /usr/bin/wc -l >>> chain() '13\n' >>> ((cat < "setup.py") | head["-n", 4])() '#!/usr/bin/env python\nimport os\n\ntry:\n' >>> (ls["-a"] > "file.list")() '' >>> (cat["file.list"] | wc["-l"])() '17\n' Michael #2: Windows 10 Linux subsystem for Python developers via Marcus Sherman “One of the hardest days in teaching introduction to bioinformatics material is the first day: Setting up your machine.” While I have seen a very large bias towards Macs in academia, there are plenty of people that keep their Windows machines as a badge of pride... Marcus included. Even though Anaconda is cross platform and helpful, how does this work on Windows? python3 -m venv .env and source .env/bin/activate? Spoiler alert: Not well. Step-by-step guide to getting Ubuntu on Windows. Shows how to set up an X server. Brian #3: Type hints cheat sheet (Python 3) Do you remember how to type hint duck types? Something accessed like an array (list or tuple or …) and holds strings → Sequence[str] Something that works like a dictionary mapping integers to strings → Mapping[int, str] As I’m adding more and more typing to interface functions, I keep this cheat sheet bookmarked. Michael #4: Python driving new languages Here are five predictions for what programming will look like 10 years from now.
Programming will be more abstract: trends like serverless technologies, containers, and low code platforms suggest that many developers may work at higher levels of abstraction in the future. AI will become part of every developer's toolkit, but won't replace them. A universal programming language will arise: to reap the benefits of emerging technologies like AI, programming has to be easy to learn and easy to build upon. "Python may be remembered as being the great-great-great grandmother of languages of the future, which underneath the hood may look like the English language, but are far easier to use." Every developer will need to work with data. Programming will be a core tenet of the education system. Brian #5: asyncio documentation rewritten from scratch twitter thread by Yury Selivanov “Big news! asyncio documentation has been rewritten from scratch! Read the new version here: https://docs.python.org/3/library/asyncio.html. Huge thanks to @WillingCarol, @elprans, and @andrew_svetlov for support, ideas, and reviews!” “BTW, this is just the beginning. We'll continue to refine and update the documentation. Next up is adding two tutorials: one teaching high-level concepts and APIs, and another teaching how to use protocols and transports. A section about asyncio architecture is also planned.” “And this is just the beginning not only for asyncio documentation, but for asyncio itself. Just for Python 3.8 we plan to add: new streaming API, TaskGroups and cancel scopes, Supervisors and tracing API, new SSL implementation, many usability improvements” Michael #6: The 2018 Python Language Summit Here are the sessions: Subinterpreter support for Python: a way to have a better story for multicore scalability using an existing feature of the language. Subinterpreters will allow multiple Python interpreters per process and there is the potential for zero-copy data sharing between them.
But subinterpreters share the GIL, so that needs to be changed in order to make it multicore friendly. Modifying the Python object model: looking at changes to CPython data structures to increase the performance of the interpreter. - via Instagram and Carl Shapiro - By modifying the Python object model fairly substantially, they were able to roughly double the performance - A little controversial - Shapiro's overall point was that he felt Python sacrificed its performance for flexibility and generality, but the dynamic features are typically not used heavily in performance-sensitive production workloads. A Gilectomy update: a status report on the effort to remove the GIL from CPython. Larry Hastings updated attendees on the status of his Gilectomy project. Since his status report at last year's summit, little has happened, which is part of why the session was so short. He hasn't given up on the overall idea, but it needs a new approach. Using GitHub Issues for Python: a discussion on moving from bugs.python.org to GitHub Issues. Mariatta Wijaya described her reasoning for advocating moving Python away from its current bug tracker to GitHub Issues. it would complete Python's journey to GitHub that started a ways back. Shortening the Python release schedule: a discussion on possibly changing from an 18-month to a yearly cadence. The Python release cycle has an 18-month cadence; a new major release (e.g. Python 3.7) is made roughly on that schedule. But Łukasz Langa, who is the release manager for Python 3.8 and 3.9, would like to see things move more quickly—perhaps on a yearly cadence. Unplugging old batteries: should some older, unloved modules be removed from the standard library? Python is famous for being a "batteries included" language—its standard library provides a versatile set of modules with the language There may be times when some of those batteries have reached their end of life. 
Christian Heimes wanted to suggest a few batteries that may have outlived their usefulness and to discuss how the process of retiring standard library modules should work. Linux distributions and Python 2: the end of life for Python 2 is coming, and this session looked at what distributions are doing to prepare. The goal is to figure out how to help the Python downstreams so that Python 2 can be fully discontinued. Python static typing update: a look at where static typing is now and where it is headed for Python 3.7. The session started off with stub files, which contain type information for libraries and other modules. Right now, static typing is only partially useful for large projects because they tend to use a lot of packages from the Python Package Index (PyPI), which has limited stub coverage. There are only 35 stubs for third-party modules in the typeshed library, which is Python's stub repository. The presenter suggested that perhaps a centralized library for stubs is not the right development model. Some projects have stubs that live outside of typeshed, such as Django and SQLAlchemy. PEP 561 ("Distributing and Packaging Type Information") will provide a way to pip install stubs from packages that advertise that they have them. Python virtual environments: a short session on virtual environments and ideas for other ways to isolate local installations. Steve Dower brought up the shortcomings of Python virtual environments, which are meant to create isolated installations of the language and its modules. Thomas Wouters defended virtual environments in a response: The correct justification is that for the average person, not using a virtualenv all too soon creates confusion, pain, and very difficult to fix breakage. Starting with a virtualenv is the easiest way to avoid that, at very little cost.
But Beazley and others (including Dower) think that starting Python tutorials or training classes with a 20-minute digression on setting up a virtual environment is wasted time. PEP 572 and decision-making in Python: a discussion of the controversy around PEP 572 and how to avoid a repeat of the thread explosion it caused. The "PEP 572 mess" was the topic of a 2018 Python Language Summit session led by benevolent dictator for life (BDFL) Guido van Rossum. Getting along in the Python community: trying to find ways to keep the mailing list welcoming even in the face of rudeness. About tkinter… Mentoring and diversity for Python: a discussion on how to increase the diversity of the core development team. Victor Stinner outlined some work he has been doing to mentor new developers on their path toward joining the core development ranks. Mariatta Wijaya gave a very personal talk that described the diversity problem while also providing some concrete action items that the project and individuals could take to help make Python more welcoming to minorities. Extras Listener feedback: CUDA is NVIDIA only, so no MacBook Pro unless you have a custom external GPU.
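A small sketch for the type hints cheat sheet item (Brian #3) in this episode, showing the two duck-type hints mentioned there, Sequence[str] and Mapping[int, str], on interface functions. The function names join_names and label_for are hypothetical, invented for illustration.

```python
from typing import Mapping, Sequence


def join_names(names: Sequence[str]) -> str:
    """Accept anything indexable that holds strings: a list, a tuple, and so on."""
    return ", ".join(names)


def label_for(code: int, labels: Mapping[int, str]) -> str:
    """Accept anything that behaves like a read-only dict from int to str."""
    return labels.get(code, "unknown")


print(join_names(["Brian", "Michael"]))  # a list works
print(join_names(("Brian", "Michael")))  # so does a tuple
print(label_for(2, {1: "one", 2: "two"}))
print(label_for(9, {1: "one", 2: "two"}))
```

Hinting with Sequence and Mapping rather than list and dict keeps the function open to any caller-supplied container with the right behavior, which is the point of type hinting duck types.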
September 15, 2018
Sponsored by DataDog -- pythonbytes.fm/datadog Brian #1: dataset: databases for lazy people dataset provides a simple abstraction layer that removes most direct SQL statements without the necessity for a full ORM model - essentially, databases can be used like a JSON file or NoSQL store. A simple data loading script using dataset might look like this: import dataset db = dataset.connect('sqlite:///:memory:') table = db['sometable'] table.insert(dict(name='John Doe', age=37)) table.insert(dict(name='Jane Doe', age=34, gender='female')) john = table.find_one(name='John Doe') Michael #2: CuPy GPU NumPy A NumPy-compatible matrix library accelerated by CUDA How many cores does a modern GPU have? CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement. You can easily make a custom CUDA kernel if you want to make your code run faster, requiring only a small code snippet of C++. CuPy automatically wraps and compiles it to make a CUDA binary. PyCon 2018 presentation: Shohei Hido - CuPy: A NumPy-compatible Library for GPU Code example >>> # This will run on your GPU! >>> import cupy as np # This is the only non-NumPy line >>> x = np.arange(6).reshape(2, 3).astype('f') >>> x array([[ 0., 1., 2.], [ 3., 4., 5.]], dtype=float32) >>> x.sum(axis=1) array([ 3., 12.], dtype=float32) Brian #3: Automate Python workflow using pre-commits We covered pre-commit in episode 84, but I still had trouble getting my head around it. This article by LJ Miranda does a great job with the workflow introduction and configuration necessary to get pre-commit working for black and flake8. Includes a nice visual of the flow. Demo of it all in action with a short video. Michael #4: py-spy Sampling profiler for Python programs Written by Ben Frederickson Lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way.
Written in Rust for speed. Doesn't run in the same process as the profiled Python program. Does NOT interrupt the running program in any way. This means py-spy is safe to use against production Python code. The default visualization is a top-like live view of your Python program. How does py-spy work? Py-spy works by directly reading the memory of the Python program using the process_vm_readv system call on Linux, the vm_read call on macOS or the ReadProcessMemory call on Windows. Brian #5: SymPy is a Python library for symbolic mathematics “Symbolic computation deals with the computation of mathematical objects symbolically. This means that the mathematical objects are represented exactly, not approximately, and mathematical expressions with unevaluated variables are left in symbolic form.” example: >>> integrate(sin(x**2), (x, -oo, oo)) √2⋅√π ───── 2 Examples on the site are interactive so you can play with it without installing anything. Michael #6: Starlette ASGI web framework The little ASGI framework that shines. It is ideal for building high performance asyncio services, and supports both HTTP and WebSockets. Very Flask-esque. Can use ultrajson (ultra fast JSON decoder and encoder written in C with Python bindings) and aiofiles for file responses. Run using uvicorn. Extras: Michael: PyCon 2019 dates out, put them on your calendar! Tutorials: May 1-2 • Wednesday, Thursday Talks and Events: May 3–5 • Friday, Saturday, Sunday Sprints: May 6–9 • Monday through Thursday Listener follow up on git pre-commit hooks util: pre-commit package Matthew Layman, @mblayman Heard the discussion about Git commit hooks at the end. I wanted to bring up pre-commit as an interesting project (written in Python!) that's useful for Git commit hooks. tl;dr: $ pip install pre-commit $ ... create a .pre-commit-config.yaml $ pre-commit install # This is a one time operation. pre-commit's job is to manage a project's Git commit hooks.
We use this on my team at work and the devs only need to run pre-commit install. This saves us from a bunch of failing CI builds where flake8 or other code style checks would fail. We use pre-commit to run flake8 and black before allowing a commit to proceed. Some projects have a pre-commit configuration to use right out of the box (e.g., black https://github.com/ambv/black#version-control-integration). Listener: You don't need that (pattern) John Tocher PyCon AU Talk Called “You don't need that” by Christopher Neugebauer, it was an interesting take on why, with a modern and powerful language like Python, you may not need the conventionally described design patterns, à la the “Gang of Four”.
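For contrast with the dataset item (Brian #1) in this episode, here is roughly what the same insert-and-lookup looks like with direct SQL through the standard library's sqlite3 module. This is a comparison sketch only, not how dataset is implemented; the explicit CREATE TABLE schema is an assumption added here, since with raw SQL the columns have to be declared up front.

```python
import sqlite3

# In-memory database, mirroring the 'sqlite:///:memory:' URL in the dataset example.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name

# With raw SQL the schema is spelled out by hand (assumed columns for this demo).
conn.execute(
    "CREATE TABLE sometable (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, gender TEXT)"
)

# Each insert is its own SQL statement with placeholders for the values.
conn.execute("INSERT INTO sometable (name, age) VALUES (?, ?)", ("John Doe", 37))
conn.execute(
    "INSERT INTO sometable (name, age, gender) VALUES (?, ?, ?)",
    ("Jane Doe", 34, "female"),
)

# The lookup is another hand-written query.
john = conn.execute(
    "SELECT * FROM sometable WHERE name = ?", ("John Doe",)
).fetchone()
print(dict(john))
```

These are the statements that the dataset example replaces with table.insert(...) and table.find_one(...).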