Installation#
This guide covers how to install and configure the geodata package for local and cloud use.
Make sure that you have the following required software set up:
Downloading Geodata#
To download geodata, open a terminal/shell window navigate to your preferred working directory, and run the following: (If you do not have Git installed, you may also directly download it with this link.
git clone https://github.com/GeodataTools/geodata.git
cd geodata
Configuring File Storage Location#
To configure where to store downloaded and processed files, define an environment variable called GEODATA_ROOT and save in your shell configuration files, such as .bashrc or .zshrc:
export GEODATA_ROOT=<YOUR_PATH_HERE>
Note
If you are using a Windows machine, you can set the environment variable by running the following command in the command prompt:
setx GEODATA_ROOT <YOUR PATH HERE>
If you are running geodata in a Jupyter Notebook, you could define the variable by adding and running the following cell:
%setenv GEODATA_ROOT <YOUR PATH HERE>
If you do not define this variable, all datasets and cutouts will be stored under ~/.local/geodata by default.
Building Geodata#
Anaconda/miniconda Environment#
Anaconda/miniconda is a powerful package manager and environment manager for Windows, macOS or Linux, and it provides easy installation for all operating systems. It is especially convenient if you are building Geodata on the cloud with potential installation permission issues.
If you already have Anaconda/miniconda installed on your machine, jump straight to the conda activate step. Otherwise, you have 2 options: download Anaconda or miniconda. Installing Anaconda requires >3GB disk space and takes minutes to download, so we will choose miniconda instead because is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages.
If you have conda 4.6 or later versions, in the terminal/shell, run the following command below to activate the conda environment.
conda activate <ENVIRONMENT_NAME>
To use geodata in Python scripts by calling import geodata, you’ll need to build the package. To do so, in the terminal/shell window, navigate to the package’s root directory (ie, “geodata”), and run the following:
pip install .
Note: If running pip install . generates errors related to being unable to install the rasterio package due to conflicts with incompatible packages, you may need to reinstall Anaconda/miniconda depending on what you went with during setup. Then run the following commands:
conda update --all
conda install rasterio
Installation with pip#
macOS/Linux#
In macOS or Linux’s terminal, navigate to the package’s root directory, and run the following:
pip install .
Note: All dependencies should install automatically upon building the package, with possible exceptions such as the rasterio library, which requires Cython and GDAL. For the source of these instructions and more documentation about rasterio, see the rasterio documentation.
If one of the dependency, such as rasterio does not install automatically (we know this through the error message from the command above), we will have to install it seperately in the terminal:
pip install rasterio
Once the library above is successfully installed, re-run the installation command above to build Geodata:
pip install .
If there is an error message regarding one of Geodata’s dependency, repeat the process and use pip install to seperately download it.
Windows#
In the Windows command prompt, navigate to the package’s root directory, and run the following:
pip install .
Note: All dependencies should install automatically upon building the package, with possible exceptions such as the rasterio library, which requires other dependencies.
If one of the dependency, such as GDAL does not install automatically (we know this through the error message from the command above), we will have to install it seperately in the terminal. There are 2 options to solve this issue. Once we download the required dependency successfully, we can proceed by re-run the install command:
pip install .
Pipwin#
This option is recommended. When we download libraries that is built on GDAL, we might run into this issue, where a GDAL API version must be specified.
Pipwin is a complementary tool for pip on Windows. pipwin installs unofficial python package binaries for windows provided by Christoph Gohlke here.
Run the following commands to download pipwin and acquire the dependencies: (This solution is adopted from stackoverflow)
pip install pipwin
pipwin install shapely
pipwin install gdal
pipwin install fiona
pipwin install pyproj
pipwin install six
pipwin install rtree
You may need to also install the wheel package pip install wheel to facilitate building the wheels.
Similarly, if you run into installation errors regarding the rasterio or bottleneck packages, you can also call pipwin install rasterio or pipwin install bottleneck to download them.
Direct Wheel Install#
To install rasterio and the necessary GDAL library, we can download the appropriate binaries for your system by hand (rasterio and GDAL) , place them into the current working directory, and run the following command in the downloads folder:
pip install -U pip
pip install {GDAL binary name here}.whl
pip install {rasterio binary name here}.whl
You may also need to also install the wheel package pip install wheel to facilitate building the wheels.
For more documentation/troubleshooting conda installation issues, see here