Speech is composed of phomenes. This task is to predict the associated phone sequences given the acoustic signal.
Read assignment page Read assignment slidesDescribe the videos based on content.
This task is to generate the description for the given video.
Game playing is an interaction between a user and an environment. Reinforcement learning can learn the agent's action for plying the games.
Task 1: implement a value-based approach for the agent to play the game.
Task 2: implement a policy-based approach for the agent to play the game.
Conditional image generation is to automatically generate natural images based on the given constraints.