SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

Silvio Giancola; Mohieddine Amine; Tarek Dghaily; Bernard Ghanem

DOI:10.1109/CVPRW.2018.00223
Corpus ID: 5047207

@article{Giancola2018SoccerNetAS,
  title={SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos},
  author={Silvio Giancola and Mohieddine Amine and Tarek Dghaily and Bernard Ghanem},
  journal={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year={2018},
  pages={1792-179210},
  url={https://api.semanticscholar.org/CorpusID:5047207}
}

Published in IEEE/CVF Conference on… 12 April 2018
Computer Science
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

This paper introduces SoccerNet, a benchmark for action spotting in soccer videos, and shows that the best model for classifying temporal segments of length one minute reaches a mean Average Precision (mAP) of 67.8%.

Topics

SoccerNet, Action Spotting, Yellow/Red Card, Soccer Videos, European Leagues, Soccer Rules, Event Spotting, Card Event, Sports Video Understanding, Soccer Events

125 Citations

A Context-Aware Loss Function for Action Spotting in Soccer Videos

A. Cioppa, A. Deliège, …, T. Moeslund

Computer Science

2020 IEEE/CVF Conference on Computer Vision and…

2020

This paper proposes a novel loss function that specifically considers the temporal context naturally present around each action, rather than focusing on the single annotated frame to spot, and demonstrates the generalization capability of this loss for generic activity proposals and detection on ActivityNet.

SoccerDB: A Large-Scale Database for Comprehensive Video Understanding

Yudong Jiang, Kaixu Cui, Leilei Chen, Canjin Wang, Changliang Xu

Computer Science

MMSports@MM

2020

This paper proposes a new soccer video database named SoccerDB, comprising 171,191 video segments from 346 high-quality soccer games, which is the largest database for comprehensive sports video understanding on various aspects.

SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos

A. Deliège, A. Cioppa, …, Marc Van Droogenbroeck

Computer Science

2021 IEEE/CVF Conference on Computer Vision and…

2021

This work proposes SoccerNet-v2, a novel large-scale corpus of manual annotations for the SoccerNet video dataset, along with open challenges to encourage more research in soccer understanding and broadcast production, and extends current tasks in the realm of soccer to include action spotting, camera shot segmentation with boundary detection, and a novel replay grounding task.

Improved Soccer Action Spotting using both Audio and Video Streams

Bastien Vanderplaetse, S. Dupont

Computer Science

2020 IEEE/CVF Conference on Computer Vision and…

2020

This work used the SoccerNet benchmark dataset, which contains annotated events for 500 soccer game videos from the Big Five European leagues, and evaluated several ways to integrate the audio stream into video-only architectures.

A Graph-Based Method for Soccer Action Spotting Using Unsupervised Player Classification

Alejandro Cartas, C. Ballester, G. Haro

Computer Science

MMSports@MM

2022

This work identifies and represents the players, referees, and goalkeepers as nodes in a graph, models their temporal interactions as sequences of graphs, and obtains an overall performance that surpasses similar graph-based methods and is competitive with computationally heavy methods.

STE: Spatio-Temporal Encoder for Action Spotting in Soccer Videos

Abdulrahman Darwish, Tallal El Shabrway

Computer Science

MMSports@MM

2022

A modified version of the Spatio-Temporal Encoder (STE) model, STE-v2, is introduced; it improves the tight a-mAP to 58.71% on the challenge split and 58.48% on the test split.

SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries

Hassan Mkhallati, A. Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck

Computer Science

2023 IEEE/CVF Conference on Computer Vision and…

2023

A novel task of dense video captioning focusing on the generation of textual commentaries anchored with single timestamps that has the potential to enhance the accessibility and understanding of soccer content for a wider audience and bring the excitement of the game to more people.

Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection

Xin Zhou, Le Kang, Zhiyu Cheng, Bo He, Jingyu Xin

Computer Science

ArXiv

2021

This tech report presents a two-stage paradigm to detect what events happen and when in soccer broadcast videos: it fine-tunes multiple action recognition models on soccer data to extract high-level semantic features, and designs a transformer-based temporal detection module to locate the target events.

RMS-Net: Regression and Masking for Soccer Event Spotting

Matteo Tomei, L. Baraldi, S. Calderara, Simone Bronzin, R. Cucchiara

Computer Science

2020 25th International Conference on Pattern…

2021

A lightweight and modular network which can simultaneously predict the event label and its temporal offset using the same underlying features is devised, which reaches a gain of more than 10 Average-mAP points on the test set when fine-tuned in combination with a strong 2D backbone.

Temporally Precise Action Spotting in Soccer Videos Using Dense Detection Anchors

J. C. V. Soares, Avijit Shah, Topojoy Biswas

Computer Science

2022 IEEE International Conference on Image…

2022

A model for temporally precise action spotting in videos uses a dense set of detection anchors, predicting a detection confidence and a corresponding fine-grained temporal displacement for each anchor. The authors experiment with two trunk architectures: a one-dimensional version of a U-Net and a Transformer encoder (TE).

85 References

Dense-Captioning Events in Videos

Ranjay Krishna, K. Hata, F. Ren, Li Fei-Fei, Juan Carlos Niebles

Computer Science

2017 IEEE International Conference on Computer…

2017

This work proposes a new model that is able to identify all events in a single pass of the video while simultaneously describing the detected events with natural language, and introduces a new captioning module that uses contextual information from past and future events to jointly describe all events.

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

João Carreira, Andrew Zisserman

Computer Science

2017 IEEE Conference on Computer Vision and…

2017

I3D models considerably improve upon the state of the art in action classification, reaching 80.2% on HMDB-51 and 97.9% on UCF-101 after pre-training on Kinetics, and a new Two-Stream Inflated 3D ConvNet that is based on 2D ConvNet inflation is introduced.

Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos

Fabian Caba Heilbron, Juan Carlos Niebles, Bernard Ghanem

Computer Science

2016 IEEE Conference on Computer Vision and…

2016

This paper introduces a proposal method that aims to recover temporal segments containing actions in untrimmed videos and introduces a learning framework to represent and retrieve activity proposals.

Automatic Soccer Video Analysis and Summarization

A. Ekin, A. Tekalp, R. Mehrotra

Computer Science

IS&T/SPIE Electronic Imaging

2003

A fully automatic and computationally efficient framework for analysis and summarization of soccer videos using cinematic and object-based features, which includes some novel low-level soccer video processing algorithms, as well as some higher-level algorithms for goal detection, referee detection, and penalty-box detection.

Soccer Video Event Annotation by Synchronization of Attack–Defense Clips and Match Reports With Coarse-Grained Time Information

Zengkai Wang, Junqing Yu, Yunfeng He

Computer Science

IEEE Transactions on Circuits and Systems for…

2017

A more generalized approach that synchronizes video events with text descriptions using high-level semantics with coarse time constraints, rather than assuming that the timestamp is given exactly in the text description.

Automatic Soccer Video Event Detection Based on a Deep Neural Network Combined CNN and RNN

Haohao Jiang, Yao Lu, Jing Xue

Computer Science

2016 IEEE 28th International Conference on Tools…

2016

A deep neural network is constructed to detect soccer video events; it uses an RNN to map the semantic features of key frames from PB to soccer event types, including goal, goal attempt, card, and corner.

Detecting Events and Key Actors in Multi-person Videos

Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander N. Gorban, K. Murphy, Li Fei-Fei

Computer Science

2016 IEEE Conference on Computer Vision and…

2016

This paper proposes a model which learns to detect events in videos while automatically "attending" to the people responsible for the event, and outperforms state-of-the-art methods for both event classification and detection on this new dataset.

Goal!! Event detection in sports video

G. Tsagkatakis, M. Jaber, P. Tsakalides

Computer Science

Computer Vision Applications in Sports

2017

Experimental results demonstrate that extremely high classification accuracy can be achieved, from a dramatically limited number of examples, by leveraging pre-trained models with fusion of spatio-temporal features.

Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs

Zheng Shou, Dongang Wang, Shih-Fu Chang

Computer Science

2016 IEEE Conference on Computer Vision and…

2016

A novel loss function for the localization network is proposed to explicitly consider temporal overlap and achieve high temporal localization accuracy in untrimmed long videos.

Leveraging Contextual Cues for Generating Basketball Highlights

Vinay Bettadapura, C. Pantofaru, Irfan Essa

Computer Science

ACM Multimedia

2016

The informativeness of five different cues derived from the video and from the environment is explored through user studies, which show that, for study participants, the highlights produced by the system are comparable to the ones produced by ESPN for the same games.
