
Upcoming workshops on IT support for research

As the new semester begins, a new round of training workshops to support research activities will be offered to students in September 2021. The workshops cover the GitLab service, the Pilot HPC Platform and machine learning with Python.

 


 

GitLab Service for Research

The 'GitLab Service for Research' is a web-based DevOps (development and operations) lifecycle tool for software development. It provides PolyU researchers with an on-premises Git repository as an alternative for storing code bases and managing software projects. This workshop introduces the 'GitLab Service for Research' and shares tips on using it.

Date: 10 Sep (Fri)
Time: 14:30 – 17:00
Venue: Online and On-site
Pre-requisite: Basic OS (Linux & Windows) and programming knowledge
Target Audience: Rpg, Tpg, Ug Students
Medium of Instruction: English

Course Outline:

  1. Introduction to GitLab Service for Research
  2. Basic Operations of Git (clone, push and pull)
  3. Branching of Git
  4. Managing Users and Groups under GitLab
  5. Permission of Project
  6. Examples of Integrating Pilot HPC Platform and GitLab

Registration: click here

 


 

Machine Learning with Python (1)

This workshop, consisting of six sessions, introduces participants to the general workflow of building machine learning models with the Python library 'scikit-learn', using practical examples.

Basic categories of machine learning, supervised learning algorithms, unsupervised learning algorithms, model validation methods, and over-sampling and under-sampling techniques will also be covered. This workshop provides participants with the basic knowledge and skills to construct machine learning models.

Date: 14 Sep (Tue), 16 Sep (Thu), 23 Sep (Thu), 28 Sep (Tue), 30 Sep (Thu), 5 Oct (Tue)
Time: 14:30 – 17:00
Venue: Online and On-site
Pre-requisite: Basic programming concepts
Target Audience: Rpg, Tpg, Ug Students
Medium of Instruction: English
Certificate: Awarded to participants who attend at least 5 of the 6 lessons

Course Outline:

Lesson 1

  1. Introduction to Machine Learning
    1. Supervised learning, unsupervised learning and reinforcement learning
    2. Feature engineering
      - Numerical data
      - Categorical data
      - Text feature
      - Image feature
    3. Model pipeline
  2. Naïve Bayes Classifier
    1. Conditional probability and Bayes' Theorem
    2. Gaussian Naïve Bayes
    3. Multinomial Naïve Bayes

 

Lesson 2

  1. Linear Regression
    1. Formulation and Gradient Descent
    2. Regression variations
      - Simple linear regression
      - Multiple linear regression
      - Basis function regression
    3. Regularization
      - Ridge, lasso and elastic net
  2. Logistic Regression
    1. Formulation and Cost function (log loss)
    2. Example on breast cancer dataset
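
As a taste of the "Formulation and Gradient Descent" topic, the following is a minimal sketch in plain Python rather than workshop material. It fits a simple linear regression by batch gradient descent on the mean squared error. The function name `fit_line`, the learning rate, the step count and the toy data are all illustrative assumptions.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error.

    lr and steps are illustrative defaults, not recommended settings.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

On this noise-free data the fitted slope and intercept should approach 2 and 1. In practice a library such as scikit-learn would normally be used instead of a hand-written loop.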

 

Lesson 3

  1. Support Vector Machine
    1. Basic Linear Algebra
    2. SVM optimization problem
      - Linear and nonlinear boundary
      - Soft margins
    3. SVM on face recognition
  2. Decision Tree and Random Forest
    1. Decision Tree
      - Terminology and mathematical expression
      - Decision boundary
    2. Random Forest
      - Classification and regression
    3. Visualizing tree models
    4. Problems with tree-based algorithms
      - Overfitting
      - Bias on imbalanced datasets

 

Lesson 4

  1. Principal Component Analysis
    1. Linear algebra prerequisite
      - Orthogonal basis, eigenvectors and eigenvalues, covariance matrix
    2. Applications of PCA
      - Dimensionality reduction
      - Visualization of high dimensional data
      - Noise filtering
    3. Example: combining PCA with SVM to improve face recognition performance
  2. K-Means Clustering
    1. Lloyd’s algorithm
    2. Challenges of using K-means
    3. Non-linear boundary problems
      - Spectral clustering
    4. Use cases
      - Data clustering and color compression

 

Lesson 5

  1. Model Validation I
    1. Evaluation metrics for classification
      - Accuracy, precision, recall, f1 score
    2. Evaluation metrics for regression
      - MAE, MSE and coefficient of determination
    3. Training and testing
      - Splitting data
      - K-fold cross validation
      - Leave-one-out cross validation
  2. Model Validation II
    1. Bias-variance trade-off
    2. Validation curve
    3. Learning curve
    4. Hyperparameter search
      - Grid search and random search
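
To illustrate the classification metrics listed in this lesson, here is a minimal sketch in plain Python, not taken from the workshop itself. It computes accuracy, precision, recall and F1 score for a binary problem. The function name `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced dataset, accuracy alone can look high while recall on the minority class is poor, which is why these metrics are usually reported together.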

 

Lesson 6

  1. Handling Imbalanced Datasets
    1. Choosing the right metrics
    2. Resampling
      - Random sampling
      - Undersampling
       (1) Tomek Links
       (2) Near-Miss
      - Oversampling
       (1) SMOTE
       (2) ADASYN
  2. Putting it all together – survival analysis
    1. Introduction to the problem
    2. Exploratory Data Analysis (EDA)
    3. Filling missing data
    4. Feature engineering
    5. Classification with random forest
    6. Hyperparameter tuning

Registration: click here

 

Pilot HPC Platform

The Pilot High Performance Computing (HPC) Platform allows PolyU research staff and students to test or develop their applications for research purposes. This workshop familiarises users with the platform and its features, and shows how to use the Pilot HPC Platform for research activities.

Date: 17 Sep (Fri)
Time: 14:30 – 17:00
Venue: Online and On-site
Pre-requisite: Basic Linux and programming knowledge
Target Audience: Research staff & Rpg, Tpg, Ug Students
Medium of Instruction: English

Course Outline:

  1. Introduction to the Pilot HPC Platform
  2. Account application
  3. Resources available
  4. Core operations of using Pilot HPC Platform
  5. Application examples running on Pilot HPC Platform
  6. Jupyter Notebook on Pilot HPC Platform

Registration: click here

 
