Research Information Technology

Johns Hopkins Information Technology offers a selection of artificial intelligence platforms and large language models (LLMs) that are approved for use in research projects. Johns Hopkins IT’s AI offerings are designed to increase the efficiency of investigative teams’ workflows and support overall research processes and pipelines. From data analysis and experimentation to predictive modeling and automation, Johns Hopkins IT’s AI platforms aim to advance the future of research at Johns Hopkins.

Guidance for Using AI in Research

Researchers must adhere to institutional and federal policies, including IRB, HIPAA, FERPA, and export control regulations. Projects involving human subjects or sensitive data require early engagement with the Johns Hopkins Medicine IRB, the Bloomberg School of Public Health IRB, or the Homewood IRB. International data transfers and cloud services must meet JHU security standards. Any use of generative AI must include rigorous human verification of outputs.

Johns Hopkins IT provides the following Research IT AI services*:

*We are continuously reviewing and validating AI tools for investigators to use to support their research projects. Please check this page often to view the latest approved tools and guidance!

HopGPT

Research IT recommends that investigators start deploying AI in their research projects with HopGPT, a private, secure generative AI platform that gives Johns Hopkins-affiliated individuals authenticated access to a selection of large language models (LLMs) from leading providers, including OpenAI, Anthropic, and Meta.

Developed and managed by Johns Hopkins Information Technology, HopGPT is best for document analysis, research workflows, and access to multiple AI models via chat and Application Programming Interfaces (APIs).

HopGPT features a user-friendly web-based chat interface for general inquiries and API access points for programmatic integration of multiple LLM capabilities–including LLaMA 4, Claude Sonnet, GPT-4o, and GPT-5–into data workflows and analytical processes.

Get started with HopGPT

Access

HopGPT is accessible to anyone with a JHED ID.

Cost

Visit the HopGPT Support page to learn more.

Customization & Control

HopGPT is highly customizable and acts as a research “sandbox” that supports private fine-tuning and experimentation.

Data Security

HopGPT operates within a privacy-preserving environment that is approved for handling Protected Health Information (PHI) and Personally Identifiable Information (PII). These safeguards align with Johns Hopkins’ institutional standards for privacy, security, and compliance.

Integration Needs

HopGPT can be accessed through a web-based interface or by programmatic API.

Limitations

HopGPT is not designed for model training, batch pipelines, or automated production workloads.

Does your project need something outside of the scope of HopGPT? Try out these tools!

Cloud-based AI at JHU

JHU provides cloud-based AI tools through Microsoft Azure’s suite of AI and machine learning services, including Azure OpenAI, Azure Machine Learning, and Azure AI Foundry. These tools are deployed within JHU’s NIST 800-171-aligned Azure environment. This configuration enables the use of advanced AI capabilities within a secure, compliant, and enterprise-managed infrastructure.

JHU’s cloud-based AI tools are integrated with the university’s existing research computing services and support both pre-trained models, such as GPT-3.5, GPT-4, and GPT-5, and custom model training workflows. This approach delivers scalable AI capabilities while maintaining institutional security controls, compliance standards, and data governance requirements.

Get started with Azure

Access

These cloud-based AI tools are available to JHU via a Research IT-managed Azure subscription.

Cost

JHU offers a pay-as-you-go cost model for its cloud-based AI tools.

Customization & Control

Azure’s AI services are highly customizable and support custom endpoints, LLM deployment, model development, and automation.

Data Security

JHU’s cloud-based AI tools rely on both institution-negotiated BAAs and a secure architectural implementation, combining contractual protections with technical controls that meet HIPAA and NIST 800-171 requirements.

Integration Needs

JHU’s cloud-based AI tools operate within the broader Azure platform and leverage standard capabilities for storage, hosting, API management, networking, and security to support complete research workflows. This allows AI workloads to be combined with the same Azure components used across the institution for data handling and application delivery. These services also connect to existing JHU-managed platforms, enabling researchers to build end-to-end AI workflows that align with current systems and governance.

Limitations

Use of Azure’s AI services may require familiarity with the Azure environment. Oversight of spending is also needed to manage cost. Use of Azure AI may also require additional IRB or IT approvals for sensitive data use.

AI Tools on ARCH

Advanced Research Computing at Hopkins (ARCH), which comprises the DSAI and Rockfish high-performance computing clusters, can support a variety of compute-intensive AI workloads in the context of open science, such as those conducted within the Data Science and Artificial Intelligence Institute.

Get started with ARCH

Access

Authorized users–including faculty, staff, students, and trainees–at JHU can request access to DSAI or Rockfish via ARCH’s allocation and account request system.

Once approved, users will receive user accounts and project allocations that grant access to compute (CPU/GPU) and storage resources.

Users must abide by ARCH’s policies on security, resource usage, privacy, and good citizenship.

Customization & Control

ARCH clusters support a flexible software environment via a module system (Lmod), as well as containers (e.g., Singularity).

Researchers can build and deploy custom ML/AI workflows, such as setting up custom Python environments, using Jupyter or Studio interactively, running batch jobs with scheduling via Slurm, managing dependencies via modules or containers, and more.

ARCH supports both CPU-based and GPU-accelerated workloads with a variety of GPU node types, allowing investigators to tailor compute resources to specific tasks (e.g., training large models, fine-tuning, inference, and data preprocessing).
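As a rough illustration of how modules, containers, and Slurm scheduling fit together, the sketch below renders the text of a Slurm batch script. It is a minimal example under stated assumptions, not ARCH's recommended template: the helper `render_slurm_script`, its default resource values, and the container image name `pytorch.sif` are hypothetical, and real partition names, module names, and limits should come from ARCH's own documentation.

```python
def render_slurm_script(job_name, command, gpus=1, cpus=4, mem_gb=16,
                        hours=2, modules=(), container=None):
    """Return the text of a Slurm batch script for a single job.

    Every default here is illustrative; none is taken from ARCH documentation.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
    ]
    if gpus > 0:
        # Request GPUs as a generic resource; omitted for CPU-only jobs.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines.append("")
    for name in modules:
        # Load each requested software module before running the job.
        lines.append(f"module load {name}")
    if container:
        # --nv exposes the host's NVIDIA GPUs inside the container.
        nv = " --nv" if gpus > 0 else ""
        command = f"singularity exec{nv} {container} {command}"
    lines.append(command)
    return "\n".join(lines) + "\n"


script = render_slurm_script(
    "finetune-demo", "python train.py",
    gpus=1, container="pytorch.sif",  # hypothetical image name
)
print(script)
```

The resulting text could be saved to a file and submitted with `sbatch`. Setting `gpus=0` drops the GPU request, which matches the idea of tailoring resources to CPU-only steps such as data preprocessing.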

Data Security

The clusters housed at ARCH are not approved for use with any regulated or controlled data: no HIPAA, NIST, or CUI data, and nothing that requires IRB interaction.

Both clusters operated by ARCH support computational research across all disciplines, including data- and compute-intensive workflows. Storage is provided via a high-performance parallel file system (WEKA) with approximately 5 PB of capacity.

As with any shared HPC resource, users are subject to ARCH’s security, privacy, and usage policies.

Integration Needs

Users can transfer data via standard HPC mechanisms (file transfer tools, network, etc.), and manage projects via the allocation/account system.

For AI projects, containers and software modules let researchers incorporate external libraries (PyTorch, TensorFlow, etc.), custom code, datasets, and more, subject to resource and security constraints.

Limitations

As with all shared clusters, investigators may face queue wait times, resource limits such as GPU availability, and storage quota constraints.

AI Tools on DISCOVERY

DISCOVERY is Research IT’s centrally managed high-performance computing (HPC) cluster that provides scalable and secure compute resources for data- and compute-intensive research. The platform supports researcher-defined AI and machine learning workflows, enabling the training, fine-tuning, and evaluation of open-source models (such as those available on Hugging Face) using tools such as PyTorch, TensorFlow, and scikit-learn.

DISCOVERY provides the hardware, drivers, and software modules necessary to support researcher-developed workflows. Research IT facilitators can provide guidance and technical support for enabling these environments.

DISCOVERY operates in a NIST 800-171-compliant environment, making it suitable for research involving regulated or sensitive data. It is well-suited for data processing, feature extraction, and simulation-based modeling, and it can also serve as a foundation for hybrid AI workflows that bridge on-premises compute with cloud-based AI environments.

Get started with DISCOVERY

Access

DISCOVERY (HPC) is available to JHU faculty, staff, students, and trainees working on a research project under a sponsoring PI.

Customization & Control

DISCOVERY’s AI tools are highly customizable, supporting containerized workflows as well as custom ML and AI pipelines.

Data Security

DISCOVERY (HPC) is NIST 800-171 compliant, suitable for regulated or sensitive research data, and has restricted/limited network access.

Integration Needs

DISCOVERY is integrated with JHU’s research storage and networks, enabling large-scale data access and collaboration.

Limitations

Job scheduling and queue times can vary on DISCOVERY depending on system load. DISCOVERY also requires technical familiarity with Linux, containers, and Slurm job submissions, and is not designed for integration with commercial API- or SaaS-based AI services.

(NEW!) AI Transcription Service

Research IT’s AI Transcription Service provides a secure solution for transcribing audio and video recordings into text using Microsoft Foundry’s Batch Speech-to-Text service.

Hosted in a Research IT-managed, PII/PHI-compliant Azure subscription, this service is designed for Hopkins researchers who need accurate transcription with speaker separation and data security compliance, with minimal technical burden and cost.

Get started with AI Transcription

Access

The AI Transcription Service is available to all Johns Hopkins researchers.

Cost

$15/month* per user or group + $0.20/hour of audio processed.

*Note: Accounts are only billed in months the service is used (billing cycles run from the 16th of one month to the 15th of the next).
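The pricing above can be worked through with a short sketch. It assumes that "used" means any audio was processed during the cycle and that the hourly rate applies proportionally to partial hours; neither detail is stated in the pricing itself. The function names `cycle_cost` and `billing_cycle` are illustrative.

```python
from datetime import date

MONTHLY_FEE = 15.00    # USD per user or group, from the pricing above
RATE_PER_HOUR = 0.20   # USD per hour of audio processed


def cycle_cost(audio_hours):
    """Estimated charge for one billing cycle; zero if nothing was processed."""
    if audio_hours <= 0:
        return 0.0
    return round(MONTHLY_FEE + RATE_PER_HOUR * audio_hours, 2)


def billing_cycle(day):
    """Return (start, end) of the 16th-to-15th cycle that contains `day`."""
    if day.day >= 16:
        start = date(day.year, day.month, 16)
        if day.month == 12:
            end = date(day.year + 1, 1, 15)
        else:
            end = date(day.year, day.month + 1, 15)
    else:
        end = date(day.year, day.month, 15)
        if day.month == 1:
            start = date(day.year - 1, 12, 16)
        else:
            start = date(day.year, day.month - 1, 16)
    return start, end


print(cycle_cost(40))                    # 15 + 0.20 * 40
print(billing_cycle(date(2025, 1, 10)))  # cycle that spans the new year
```

Under these assumptions, 40 hours of audio in one cycle would come to $23.00, and a cycle with no processing would cost nothing.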

Data Security

The AI Transcription Service is safe to use with PHI/PII data. The service is not NIST-compliant.

Specifications

- Supported audio file types: mp3, wav, m4a, flac, ogg, wma, and aac
- Supported video file types: mp4, mov, avi, mkv, wmv, webm, m4v, 3gp, and ts
- Transcript output formats: .txt and .docx
- Language support: English-language recordings only (currently)
- Speaker separation: configurable maximum speaker count
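Before uploading, it can be handy to check a batch of files against the supported types listed above. The sketch below is one way to do that locally; it looks only at the file extension and does not inspect file contents, and the helper name `recording_kind` is hypothetical rather than part of the service.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "flac", "ogg", "wma", "aac"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "wmv", "webm", "m4v", "3gp", "ts"}


def recording_kind(filename):
    """Return "audio", "video", or None, judged by file extension alone."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None


for name in ["interview_01.MP3", "focus_group.webm", "notes.docx"]:
    print(name, "->", recording_kind(name))
```

A file that returns `None` here would need to be converted to one of the listed formats first.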

User Workflow

Users upload recordings to their dedicated storage container via Azure Storage Explorer. Recordings are automatically processed through the AI speech-to-text model, and transcripts are written to the storage container for download.

Recordings are automatically purged after processing. Transcripts and any other files are automatically purged after 30 days.
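Because transcripts are removed automatically, it may help to track when each one needs to be downloaded. The sketch below assumes the 30-day window is counted from the moment the transcript is created, which the retention note does not specify; `purge_deadline` and `days_remaining` are illustrative names.

```python
from datetime import datetime, timedelta

# Retention window from the note above; the starting point is an assumption.
TRANSCRIPT_RETENTION = timedelta(days=30)


def purge_deadline(created_at):
    """Latest time the transcript is assumed to remain available."""
    return created_at + TRANSCRIPT_RETENTION


def days_remaining(created_at, now):
    """Whole days left before the assumed purge, never below zero."""
    remaining = purge_deadline(created_at) - now
    return max(remaining.days, 0)


created = datetime(2025, 3, 1, 9, 0)
print(purge_deadline(created))
print(days_remaining(created, datetime(2025, 3, 21, 9, 0)))
```

Downloading transcripts well before the estimated deadline avoids relying on the exact purge timing.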