What skills should you be building for a career as a data scientist in Digital Agriculture?
I often have students come to me saying “I’m passionate about the agri-food sector and I want to do data science. What skills do I need to land a satisfying job, and ultimately a career, in this area?” So I figured I’d take the opportunity to share what I typically say to them here, for the benefit of all those out there with similar goals and interests. Key items are in bold.
First off, there is so much unstructured data out there, in disparate formats and locations, that you aren’t going to get far without a programming language under your belt. And in this field, that really boils down to Python and/or R. Sure, ChatGPT can write code, but trust me, it’s not there yet.
Next, everything we do has a spatial and temporal component. These days GIS skills are a big plus! You don’t have to be a GIS expert, but you should know what a coordinate reference system and datum are, understand the limitations and practicalities of aggregation and disaggregation to different levels of resolution, and be facile with vector and raster manipulations.
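To make the aggregation and disaggregation point a little more concrete, here is a minimal Python sketch using NumPy. The 4x4 grid of values and the helper names `aggregate_mean` and `disaggregate_repeat` are made up for illustration, and real work would typically go through a proper geospatial library that also tracks the coordinate reference system. The idea it shows is that averaging a raster to a coarser resolution preserves the overall mean but throws away within-block detail, and naively disaggregating back to the fine grid does not recover it.

```python
import numpy as np


def aggregate_mean(raster: np.ndarray, factor: int) -> np.ndarray:
    """Aggregate a 2-D raster by averaging non-overlapping factor x factor blocks."""
    rows, cols = raster.shape
    if rows % factor or cols % factor:
        raise ValueError("raster dimensions must be divisible by factor")
    # Split each axis into (number of blocks, cells per block), then average per block.
    blocks = raster.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


def disaggregate_repeat(raster: np.ndarray, factor: int) -> np.ndarray:
    """Disaggregate by copying each coarse cell into a factor x factor block."""
    return np.repeat(np.repeat(raster, factor, axis=0), factor, axis=1)


# Hypothetical fine-resolution grid (values are illustrative only).
fine = np.array([
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 6.0, 6.0],
    [2.0, 2.0, 6.0, 6.0],
])

coarse = aggregate_mean(fine, 2)            # 2x2 grid of block means
restored = disaggregate_repeat(coarse, 2)   # back to 4x4, but smoothed

print(coarse)
print("overall mean preserved:", np.isclose(fine.mean(), restored.mean()))
print("fine detail recovered:", np.array_equal(fine, restored))
```

The round trip keeps the grid-wide average but flattens the variation inside each block, which is the kind of limitation worth keeping in mind when combining datasets reported at different resolutions.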
Data science applied to any domain involves statistics and modeling. Moving beyond point estimates with p-values and understanding Bayesian statistics will get you far. And an understanding of databases (e.g., relational, graph, or columnar) can also be useful.
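As a small illustration of what moving beyond a point estimate can look like, here is a sketch of a Beta-Binomial update in Python. The numbers are invented (18 of 25 hypothetical trial plots responding to a treatment), the uniform Beta(1, 1) prior is an assumption chosen for simplicity, and the interval is approximated by sampling rather than computed exactly.

```python
import numpy as np

# Hypothetical data: 18 of 25 trial plots responded to a treatment.
successes, trials = 18, 25

# Assumed prior: Beta(1, 1), i.e. uniform over the response rate.
prior_a, prior_b = 1.0, 1.0

# Conjugate update: the posterior is Beta(prior_a + successes, prior_b + failures).
post_a = prior_a + successes
post_b = prior_b + (trials - successes)

# Point estimate (observed proportion) versus the posterior mean.
point_estimate = successes / trials
posterior_mean = post_a / (post_a + post_b)

# Approximate a 95% credible interval by sampling from the posterior.
rng = np.random.default_rng(0)
samples = rng.beta(post_a, post_b, size=100_000)
lower, upper = np.quantile(samples, [0.025, 0.975])

print(f"point estimate: {point_estimate:.3f}")
print(f"posterior mean: {posterior_mean:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

Instead of a single number, you end up with a full distribution over the response rate, which makes it easier to talk honestly about uncertainty.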
Finally, many datasets are incredibly large, so analyses often can’t be performed on your laptop. Familiarity with running analyses on High-Performance Computing (HPC) infrastructure can therefore be critical. These systems have mechanisms in place for you to schedule jobs to be run across multiple processors in tandem with other people’s jobs, and there are conventions and rules of etiquette for that.
Depending on the positions you have on your wish list, you may want to be sure to have either a Master’s-level degree or a Ph.D. A Master’s should suffice if you’re happy to have someone else identify and devise the scope of the problems you work on. If you want to do pure R&D and define your own problems, a Ph.D. will likely be necessary.
If you're lacking some of these skills or qualifications, there are numerous data science degrees offered across the country, as well as short instructional modules such as those included in GEMS Learning.