OSDC for Perl developers 2023.01

Participants

  • Chenhui Niu
  • Darren Harwood
  • Simon Windsor
  • Steve Rogerson

Mentors

Gabor Szabo

https://osdc.code-maven.com/osdc-2023-01-perl/

  • Start day: 2023.01.24

Video Playlist


Session 1: Welcome, Version Control, Journal, Slack

  • Welcome

    • overview of the course
      • git
      • GitHub
      • (GitLab)
      • Markdown
      • Docker
      • Testing
      • Static analysis
      • Communication
      • Slack
    • a little about myself
      • Self employed
      • Training
      • Introducing testing, CI etc.
    • If you'd like to send me an email, reply to the one I sent you. Keep the subject line and remove the irrelevant content. Without this it is very difficult for me to associate the emails with the different courses I teach.
    • Assignments
      • Will be in some public GitHub or GitLab repositories
      • At the end of each assignment you'll write a report - a blog post / journal entry.
      • You will add it to your personal JSON file and send a Pull-Request with the change. (We'll learn these soon)
      • First few assignments will be to my projects or your own projects. This allows for quick feedback and integration.
      • Then we'll find you open source projects maintained by other people.
    • The more you participate, the more you learn in this course.
    • The more effort you put in this course, the more you will gain.
      • Ask questions!
      • Try to help others! The more you help others the more you will learn.
    • Grades (if relevant) are based on the work done during the course. There is no final project or exam.
  • Version Control

  • GitHub: the process of contributing to an Open Source project using the GitHub web site: editing and sending a Pull-Request. Use the cm-demo user to make a change in the README of this repository and then to add the JSON file. Show how the CI fails when we add an incorrectly formatted file.

  • What is JSON?

  • Show the Git repository of the project and the web site generated from it.

  • Show blogging platform

  • We saw the drawing of the GitHub process in the cloud.

  • We looked at CPAN-Digger

  • We looked at the recent uploads on MetaCPAN
  • We looked at some of the features of GitHub while looking at the Dancer project. (Insights, list of commits, forks, stars, watch)
  • The original Markdown
  • GitHub flavored Markdown
  • Markdown.

    • Subtitle
    • Bullet points
    • Links
    • Bold
  • Video 1-1

    • 00:00 Welcome
  • Video 1-2

Assignment 1

  • You will have to publish a journal of your process.
  • Create an account on the blogging platform you selected. (if you already have one, you can use that)
  • Create an account on GitHub (if you already have one, use that)
  • Create an account on GitLab (if you already have one, use that)
  • Add a picture to all these accounts. It is preferably a picture of you, but it can be a drawing of you, or some other avatar you might want to use.
  • Send a pull-request to the GitHub repository of the course adding a JSON file. The name of the file should be your GitHub username and it should include key-value pairs as in the example. (The posts will be an empty list.) Check the result of GitHub Actions.
  • Join the Slack workspace (I sent invitations to everyone's email address) and say hi.
  • Write a blog post about the course. In your post link to your GitHub and GitLab accounts and to your Pull-Request. If you encountered any issue, write about that and how you solved it. If you use an avatar instead of your own picture, describe how you created the avatar.
  • In the blog post tell us a bit about your background.
    • What programming language(s) do you use?
    • Which interesting 3rd-party libraries do you use? You can mention big ones, but it is probably more interesting if you mention more esoteric ones.
    • Include links to the home-page of each project and the GitHub/GitLab repository of each project.
    • What would you like to accomplish in the course?
    • Which open source projects would you like to contribute to?
  • Update your Pull-Request by adding the URL of the blog post to the posts field in the JSON file.

  • There are 3 GitHub repositories with lists of GitHub organizations published by higher education institutions, governments, and corporations. Find at least 5 more organizations that share some of their code under an open source license on GitHub or GitLab. An organization can be a corporation, a university, a college, a research institute, or a government. (E.g. find a list of universities and use the search feature of GitHub to find GitHub organizations that belong to the institute.)

A couple of suggestions for the blog posts:

  • Use a title that can sound interesting to others as well, e.g. How to contribute to an open source project or How to send a pull request on GitHub.
  • Add the osdc tag and other relevant tags.
  • Add the series: field to the Jekyll front matter (the header of each post on DEV.to).
  • Use Markdown in the post.
  • Include links to the relevant sites and pages such as the web site of the Open Source Development Course and the web site of our course: Open Source Development Course for Perl developers.

Session 2: Create GitHub Pages using the git CLI; GitHub Actions

  • HTML - Hyper Text Markup Language

    • just view source in a browser
  • GitHub pages https://cm-demo.github.io/

    • Plain Markdown files in the docs/ folder
    • Configuring GitHub Actions with Jekyll in the .github/workflows/ folder. Then we changed the source to be docs.
  • Git configuration

git config --global --add user.name "Foo Bar"
git config --global --add user.email foo@bar.com

These commands created the ~/.gitconfig file.

ssh-keygen

Add the public key to GitHub in the user settings area.

```
git clone

git status
git diff
git add
git commit
git show SHA
git push
git remote -v

git blame

git pull   # both with merge and rebase
```

In ~/.gitconfig set the default action for pull:

[pull]
    rebase = true
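The same setting can also be written with git config instead of editing the file by hand. A minimal sketch: it writes to a throwaway file (the demo_config variable is only for illustration) so it can be tried without touching the real ~/.gitconfig. Replacing `--file "$demo_config"` with `--global` would change ~/.gitconfig itself.

```shell
# Throwaway config file used only for this demonstration (hypothetical name).
demo_config=$(mktemp)

# Equivalent of adding "[pull] rebase = true" by hand.
git config --file "$demo_config" pull.rebase true

# Read the value back and show the resulting file.
pull_rebase=$(git config --file "$demo_config" --get pull.rebase)
echo "pull.rebase = $pull_rebase"
cat "$demo_config"
```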

We also saw:

gitk --all

Assignment 2

  • Set up your own website on GitHub Pages

Once it is done add the following entry to your JSON file: "github_page": true,

See the mentors/szabgab.json for an example.

  • Collect the git repositories of the projects you depend on. For CPAN modules, MetaCPAN might have the link.
  • Add them as a list to your JSON file. See the mentors/szabgab.json for an example.
  • Blog about what we learned. Add links. (See my suggestions above on how to improve your blog post.) If you feel something is missing from my notes (this file), feel free to add it with a PR.

Session 3: GitHub Actions, CPAN Digger

Specifically we looked at:

  • Bash
  • PostgreSQL
  • Perl with Makefile.PL
  • OSDC Site generator

Assignment 3

  • Find at least 2 Perl modules on CPAN Digger that have "something missing". Send a pull-request to each one of them.
    • Look at CPAN Digger
    • Some distributions have a link to their GitHub repository but not to the "issues".
    • Some distributions don't have a link to their GitHub. In some cases it isn't hard to track down the repository and then you can change it to make it include the links to the GitHub repository and to the issues.
  • Blog about them!

Session 4: Docker

(2023.02.28)

We set out working on one of the Perl modules listed in the JSON files in the participants/ folder, but then ended up covering Docker.

We used many examples from the Docker slides to introduce Docker.

We saw the Docker image I use to deal with various open source projects.

We added Dist::Zilla to this image.

Assignment 4

  • Create a Docker image for yourself.
  • Run the tests of one of your favorite modules inside the container.
  • Write a blog post linking to the issues you opened and the Pull-Request you sent. Even if they have not been accepted. Then add the link of that blog post to your JSON file in the participants folder.

Session 5: Docker HUB; Docker Compose

The Docker configuration file can map all the data (all the volumes, images, containers) to a disk other than the default one.

$ cat /etc/docker/daemon.json
{
    "data-root": "/home/data/docker"
}

  • Show one of my real-world projects using Docker compose
  • Show editing a project while it is running in Docker compose

Take one of the projects from the list of projects of the participants and run the tests locally using the Docker container (dr). Open an issue when necessary. Set up GitHub Actions if needed. Create a test coverage report.

  • We looked at https://github.com/reneeb/Types-RENEEB and found out that Steve already had a fork and that his changes were already applied to the original repo. (The best course of action at this point might be to remove the fork and create it again.)

    • We found that the project had clear instructions on how to set up the development environment and how to run the tests. Nice.
    • We also found that some of the tests fail. Not so good. Steve will check why. He will probably open an issue with the failure. Even if later he finds out the reason, it is a good idea to have it documented on GitHub.
  • Video 5-1

  • Video 5-2

Assignment 5

  • Pick one or more projects
  • Try to set up the local development environment.
    • If you cannot, open an issue asking the author how. (Feel free to mention me by including @szabgab and to mention the course by including a link to https://osdc.code-maven.com/)
    • If you can set it up, verify that there are clear instructions on how to do this in the README file or in some other file (e.g. CONTRIBUTING).
      • If there are not, send a PR with the instructions so the next person will have less trouble.
    • Run the tests locally.
      • If there are failures report them.
    • Check if GitHub Actions is configured. If not, then configure it. (or open an issue asking the author if s/he wants it).
    • Create a test coverage report. Add more tests if possible.

Session 6: GitHub Actions for Types::RENEEB

git clone git@github.com:reneeb/Types-RENEEB.git

  • Create a fork of the repository via GitHub UI.

  • Setup a new git remote to point to the forked repository:

git remote add fork git@github.com:yewtc/Types-RENEEB.git

  • Create a branch to prepare a pull-request

git checkout -b BRANCH
# make changes
git add .
git commit -m "..."
git push --set-upstream fork BRANCH

  • Send the pull-request from the GitHub UI

  • Integrate the progress of original repository to our local clone

git checkout master
git pull origin master

  • Once the Pull-request was accepted we could delete the branch locally and remotely

git branch -d BRANCH
git push -d fork BRANCH

  • We created a test coverage report of the Types-RENEEB package. It was not too interesting as everything was 100%.

```
cpanm Dist::Zilla::App::Command::cover

dzil cover
```
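The fork and pull-request steps of this session can be rehearsed without GitHub. In the sketch below two local bare repositories stand in for the original project and the fork. The directory names, the fix-readme branch and the README change are all invented for the demonstration, and the merge that GitHub would do when the pull-request is accepted is imitated by hand.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Two bare repositories play the roles of the original project and the fork.
git init -q --bare upstream.git
git init -q --bare fork.git

# Seed the "original" repository with one commit.
git clone -q upstream.git seed
cd seed
git config user.name "Foo Bar"
git config user.email foo@bar.com
echo "Types-RENEEB" > README.md
git add README.md
git commit -q -m "initial commit"
main=$(git rev-parse --abbrev-ref HEAD)   # master or main, depending on the git configuration
git push -q origin "$main"
cd "$work"

# Clone the original project (remote "origin") and add the fork as a second remote.
git clone -q upstream.git clone
cd clone
git config user.name "Foo Bar"
git config user.email foo@bar.com
git remote add fork "$work/fork.git"

# Prepare the change on a branch and push it to the fork.
git checkout -q -b fix-readme
echo "Some documentation" >> README.md
git add .
git commit -q -m "improve README"
git push -q --set-upstream fork fix-readme

# Stand-in for the maintainer accepting the pull-request on GitHub.
cd "$work/seed"
git pull -q --ff-only "$work/fork.git" fix-readme
git push -q origin "$main"

# Integrate the progress of the original repository into our clone.
cd "$work/clone"
git checkout -q "$main"
git pull -q origin "$main"

# Clean up: the branch was pushed to the fork, so it is deleted there.
git branch -d fix-readme
git push -q -d fork fix-readme
git log --oneline
```

On GitHub the middle step is replaced by opening the pull-request in the web UI and waiting for the maintainer to merge it.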

Session 7: GitHub Action for DBIx-Class, ack

In this session we tried to add GitHub Actions (CI) to DBIx::Class.

It already had an extensive configuration of Travis-CI, but unfortunately Travis stopped its free service.

I briefly explained how my slides used to be generated using a Webhook.

We used the playground Docker image we covered in a previous session.

We still had to install Module::Install as it was a developer dependency.

cpanm --notest Module::Install

Then we could install all the other dependencies:

cpanm --installdeps --notest .

However when we ran perl Makefile.PL it still complained about a lot of missing modules. So we created a temporary file just to install those:

cpanm --notest Class::DBI::Plugin::DeepAbstractSearch
cpanm --notest Class::MethodCache
cpanm --notest Class::Unload
cpanm --notest Date::Simple
cpanm --notest DateTime::Format::MySQL
cpanm --notest DateTime::Format::Pg
cpanm --notest DateTime::Format::SQLite
cpanm --notest DateTime::Format::Strptime
cpanm --notest JSON
cpanm --notest JSON::Any
cpanm --notest JSON::DWIW
cpanm --notest JSON::XS
cpanm --notest Math::Base36
cpanm --notest MooseX::Types::JSON
cpanm --notest MooseX::Types::LoadableClass
cpanm --notest MooseX::Types::Path::Class
cpanm --notest PadWalker
cpanm --notest Pod::Coverage
cpanm --notest SQL::Translator
cpanm --notest Test::EOL
cpanm --notest Test::NoTabs
cpanm --notest Test::Pod
cpanm --notest Test::Pod::Coverage
cpanm --notest Test::Strict
cpanm --notest Text::CSV
cpanm --notest Time::Piece::MySQL

There must be a better way to install all of these, but we could not find the instructions. So one addition to the project could be an easy way to find instructions on how to set up the local development environment.
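One possible shortcut, offered only as a sketch and not as the project's recommended procedure: keep the module names in a plain text file and hand them to cpanm in a single invocation. The temporary file, the install_dev_modules helper and the shortened list are made up for this example.

```shell
# Hypothetical file holding one module name per line (shortened list).
modules_file=$(mktemp)
cat > "$modules_file" <<'EOF'
Class::Unload
Date::Simple
DateTime::Format::Pg
JSON::XS
SQL::Translator
Test::Pod
EOF

# cpanm accepts several module names in one call; xargs passes the whole list.
install_dev_modules() {
    xargs cpanm --notest < "$1"
}

module_count=$(wc -l < "$modules_file" | tr -d ' ')
echo "modules to install: $module_count"

# To actually install them (needs cpanm and network access):
#   install_dev_modules "$modules_file"
```

Keeping the list in a file under version control would also give the next contributor one place to look for the developer dependencies.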

After this we managed to run the tests though many were skipped for various reasons. For example some needed access to real databases.

Then we set up CI based on one of the GitHub Actions examples.

We pushed it out, but it did not start to work. Yesterday GitHub had a big outage, so we looked at the GitHub status and indeed, GitHub Actions was yellow. It did not work properly. Eventually it started our job, but it took several minutes to do so.

To our (or at least my) surprise, once GitHub Actions started to run, the tests passed on the first attempt.

However we wanted to see how to run the tests with PostgreSQL as well.

For that we had to install DBD::Pg, another module that does not have a CI. (Neither does DBD::mysql or DBD::Oracle, for that matter.)

We found the instructions for installing DBD::Pg locally in the README of DBD::Pg, which is linked from the DBD-Pg page. (I did not remember that during our session.)

apt-get install libpq-dev
cpanm --notest DBD::Pg

Then we added another YAML file for the GitHub Action, this time running inside a Docker container, based on the PostgreSQL GitHub Action example.

It needed some parameters to pass the hostname of the Postgres server, the name of the database, the username and the password.

I briefly used and mentioned ack (aka beyondgrep), an excellent grep-like tool written in Perl.

To our surprise this too worked on the first try. I was so surprised I had to see a more detailed report and thus we split the GitHub Action job up and had a separate section where we used prove to run the tests in verbose mode.

At the end we sent a pull-request with what we accomplished.

Session 8: DBD::Pg, test coverage with Devel::Cover

Session 9: git stash, detached head

  • git

git add -i
git stash -u
git stash list
git stash show
git stash show -p
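The stash commands can be tried safely in a throwaway repository. A minimal sketch: the file names and the commit message are made up for the demonstration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Foo Bar"
git config user.email foo@bar.com

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# An uncommitted change to a tracked file and a brand new untracked file.
echo "work in progress" >> notes.txt
echo "scratch" > todo.txt

# -u (--include-untracked) stashes the untracked file as well.
git stash -u
git status --short     # prints nothing: the working tree is clean again

git stash list         # one entry: stash@{0}
git stash show         # summary of the changed files
git stash show -p      # the full diff

# Bring the changes back and drop the stash entry.
git stash pop
```

Without -u only the change to notes.txt would be stashed and todo.txt would stay in the working directory.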

Session 10: git stash and bisect, adding test to RSRU

Date: 2023.04.25

  • git
    • detached head
    • merge / rebase
    • reset
    • stash
    • bisect
    • revert
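Of these topics, bisect is the easiest to try in isolation. The sketch below builds a made-up history of ten commits in which the word "bug" appears from the 7th commit onward, then lets git bisect run find the first bad commit automatically. The repository, the file name and the grep-based check are assumptions for the demonstration. In a real project the check would usually be a test script.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Foo Bar"
git config user.email foo@bar.com

# Ten commits; from the 7th onward the file contains the word "bug".
for i in 1 2 3 4 5 6 7 8 9 10; do
    if [ "$i" -ge 7 ]; then
        echo "step $i bug" > file.txt
    else
        echo "step $i ok" > file.txt
    fi
    git add file.txt
    git commit -q -m "commit $i"
done

first_commit=$(git rev-list --max-parents=0 HEAD)

# HEAD is known to be bad, the very first commit is known to be good.
git bisect start HEAD "$first_commit"

# The command must exit with 0 for a good commit and 1 for a bad one.
git bisect run sh -c '! grep -q bug file.txt'

# Remember the result before leaving bisect mode.
culprit=$(git rev-parse refs/bisect/bad)
culprit_subject=$(git log -1 --format=%s "$culprit")
git bisect reset
echo "first bad commit: $culprit_subject"
```

With ten commits git needs only a handful of checkouts to narrow the range down, which is the point of the binary search.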

Session 11: Git Flow; GitHub Actions for Test::Class

Date: 2023.05.02

  • We talked about Git Flow vs. having a single main branch. The latter would need good (acceptance/integration) test coverage so you can trust the tests for a release without requiring manual QA. (The manual QA can still work as exploratory testing that also feeds ideas to the people who then convert the test-cases into automated tests.)

  • We looked at the CI system of Test::Class that first creates a release and then uses that release on various versions of Perl to test it. It also tests a number of downstream distributions to verify that the new version does not break them.

  • We looked at the GitHub Actions and the source code of CPAN Dashboard
  • We looked at the MetaCPAN tools and the experimental dashboard there.
  • We looked at the stats page of PyDigger and the mess of the licenses.
  • CPAN Rocks

  • Video 11-1

  • Video 11-2

Session 12: GitHub Actions; faster CI

  • Date: 2023.05.16

  • GitHub Action when files change

  • GitHub Actions with parameters

  • We talked about frequency of Pull-Requests or code-reviews.

  • Docker caching - changing the Makefile.PL outside will make the COPY Makefile.PL . command run again and that will trigger all the steps that follow it.

  • We talked about CI processes that take too much time to run. Apparently I said earlier that they should run in under 10 minutes, which of course is just a random number.

  • A few ideas how to deal with CI processes that take too long.
  • Ideally we would like to run all the tests on every push and we would like all the tests to finish within 1 second to get feedback quickly.
  • Only try to compile Perl scripts if one of the modules change.
  • Add more hardware so jobs can be run in parallel.
  • Divide the CI to several tiers.
    • One that runs fast (1-2 min?) and hopefully checks critical things. Run these on every commit.
    • Another one that runs longer (e.g. 10 minutes) and we'll run it only every 10-15 minutes. Maybe we miss some of the pushes.
    • Another tier that takes several hours. Run these only once the earlier one finishes.
  • Instead of rebuilding the database every time using Perl, dump the data and load it with the tools of the database. That should be much faster.
  • Cache installations of 3rd party libraries.

  • Video 12-1

  • Unfortunately I've forgotten to record the 2nd part.
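The Docker caching point from this session can be illustrated with a small Dockerfile. This is only a sketch: the perl:5.36 base image, the /opt working directory and the use of prove are assumptions for the example, not the setup used in the session. Makefile.PL is copied and the dependencies are installed before the rest of the source is copied. Editing other files therefore reuses the cached dependency layer, while editing Makefile.PL invalidates that COPY step and every step after it.

```shell
# Write the example Dockerfile into a scratch directory (hypothetical project layout).
demo_dir=$(mktemp -d)
cat > "$demo_dir/Dockerfile" <<'EOF'
FROM perl:5.36
WORKDIR /opt

# Changes rarely: this layer and the next one come from the cache
# as long as Makefile.PL stays the same.
# (Assumes Makefile.PL can run without the rest of the source tree.)
COPY Makefile.PL .
RUN cpanm --notest --installdeps .

# Changes often: only the steps from here on are repeated after a code edit.
COPY . .
CMD ["prove", "-l", "t"]
EOF

cat "$demo_dir/Dockerfile"

# To build it next to a real Makefile.PL (requires Docker):
#   docker build -t my-perl-project .
```

Running the build twice in a row and then touching only a test file should show the dependency installation step being served from the cache.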

Session 13: CVEs and lack of maintenance of CPAN modules

Date: 2023.05.23