
Johns Hopkins Information Technology offers a selection of artificial intelligence platforms and large language models (LLMs) that are approved for use in research projects. Johns Hopkins IT’s AI offerings are designed to increase the efficiency of investigative teams’ workflows and support overall research processes and pipelines. From data analysis and experimentation, to predictive modeling and automation, Johns Hopkins IT’s AI platforms aim to advance the future of research at Johns Hopkins.
Guidance for Using AI in Research
Researchers must adhere to institutional and federal policies, including IRB, HIPAA, FERPA, and export control regulations. Projects involving human subjects or sensitive data require early engagement with the Johns Hopkins Medicine IRB, the Bloomberg School of Public Health IRB, or the Homewood IRB. International data transfers and cloud services must meet JHU security standards. Any use of generative AI must include rigorous human verification of outputs.
Johns Hopkins IT provides the following Research IT AI services*:
*We are continuously reviewing and validating AI tools for investigators to use to support their research projects. Please check this page often to view the latest approved tools and guidance!
HopGPT
Research IT recommends that investigators who are new to using AI in their research projects start with HopGPT, a private, secure generative AI platform that gives Johns Hopkins-affiliated individuals authenticated access to a selection of large language models (LLMs) from leading providers, including OpenAI, Anthropic, and Meta.
Developed and managed by Johns Hopkins Information Technology, HopGPT is best for document analysis, research workflows, and access to multiple AI models via chat and Application Programming Interfaces (APIs).
HopGPT features a user-friendly web-based chat interface for general inquiries and API access points for programmatic integration of multiple LLM capabilities–including LLaMA 4, Claude Sonnet, GPT-4o, and GPT-5–into data workflows and analytical processes.
Access
HopGPT is accessible to anyone with a JHED ID.
Cost
Visit the HopGPT Support page to learn more.
Customization & Control
HopGPT is highly customizable and acts as a research “sandbox” that supports private fine-tuning and experimentation.
Data Security
HopGPT operates within a privacy-preserving environment that is approved for handling Protected Health Information (PHI) and Personally Identifiable Information (PII). These safeguards align with Johns Hopkins’ institutional standards for privacy, security, and compliance.
Integration Needs
HopGPT can be accessed through a web-based interface or by programmatic API.
Limitations
HopGPT is not designed for model training, batch pipelines, or automated production workloads.
Does your project need something outside of the scope of HopGPT? Try out these tools!
Cloud-based AI at JHU
JHU provides cloud-based AI tools through Microsoft Azure’s suite of AI and machine learning services, including Azure OpenAI, Azure Machine Learning, and Azure AI Foundry. These tools are deployed within JHU’s NIST 800-171-aligned Azure environment. This configuration enables the use of advanced AI capabilities within a secure, compliant, and enterprise-managed infrastructure.
JHU’s cloud-based AI tools are integrated with the university’s existing research computing services and support both pre-trained models, such as GPT-3.5, GPT-4, and GPT-5, and custom model training workflows. This approach delivers scalable AI capabilities while maintaining institutional security controls, compliance standards, and data governance requirements.
Access
These cloud-based AI tools are available to JHU via a Research IT-managed Azure subscription.
Cost
JHU offers a pay-as-you-go cost model for its cloud-based AI tools.
Customization & Control
Azure’s AI services are highly customizable and support custom endpoints, LLM deployment, model development, and automation.
Data Security
JHU’s cloud-based AI tools rely on both institution-negotiated BAAs and a secure architectural implementation, combining contractual protections with technical controls that meet HIPAA and NIST 800-171 requirements.
Integration Needs
JHU’s cloud-based AI tools operate within the broader Azure platform and leverage standard capabilities for storage, hosting, API management, networking, and security to support complete research workflows. This allows AI workloads to be combined with the same Azure components used across the institution for data handling and application delivery. These services also connect to existing JHU-managed platforms, enabling researchers to build end-to-end AI workflows that align with current systems and governance.
Limitations
Use of Azure’s AI services may require familiarity with the Azure environment. Oversight of spending is also needed to manage cost. Use of Azure AI may also require additional IRB or IT approvals for sensitive data use.
AI Tools on ARCH
Advanced Research Computing at Hopkins (ARCH), which comprises the DSAI and Rockfish high-performance computing clusters, can support a variety of compute-intensive AI workloads in the context of open science, such as those conducted within the Data Science and Artificial Intelligence Institute.
Access
Authorized users–including faculty, staff, students, and trainees–at JHU can request access to DSAI or Rockfish via ARCH’s allocation and account request system.
Once approved, users will receive user accounts and project allocations that grant access to compute (CPU/GPU) and storage resources.
Users must abide by ARCH’s policies on security, resource usage, privacy, and good citizenship.
Customization & Control
ARCH clusters support a flexible software environment via a module system (Lmod), as well as containers (e.g., Singularity).
Researchers can build and deploy custom ML/AI workflows, such as setting up custom Python environments, using Jupyter or Studio interactively, running batch jobs with scheduling via Slurm, managing dependencies via modules or containers, and more.
ARCH supports both CPU-based and GPU-accelerated workloads with a variety of GPU node types, allowing investigators to tailor compute resources to specific tasks (e.g. training large models, fine-tuning, inference, data preprocessing, etc.).
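As a rough illustration of the Slurm batch workflow described above, the following Python sketch generates a job script that requests GPU resources. It uses only standard Slurm directives (`--job-name`, `--partition`, `--gres`, `--cpus-per-task`, `--time`, `--output`). The function name, the partition name `gpu`, and the command `python train.py` are hypothetical placeholders, not ARCH-specific values; consult ARCH documentation for the actual partitions, modules, and limits on DSAI and Rockfish.

```python
def build_sbatch_script(job_name: str, partition: str, gpus: int,
                        cpus: int, hours: int, command: str) -> str:
    """Return the text of a Slurm batch script (illustrative helper, not an ARCH tool)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",       # partition names are site-specific
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={hours:02d}:00:00",      # wall-clock limit as HH:MM:SS
        f"#SBATCH --output={job_name}-%j.out",    # %j expands to the Slurm job ID
    ]
    if gpus > 0:
        # Request GPUs only for GPU-accelerated work; CPU-only jobs omit this line.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines += ["", command]
    return "\n".join(lines) + "\n"
```

For example, `build_sbatch_script("finetune", "gpu", 1, 8, 4, "python train.py")` returns a script that could be saved to a file and submitted with `sbatch`. A real job would typically also load the required software modules or launch a container before running the command.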
Data Security
The clusters housed at ARCH are not approved for use with any regulated or controlled data. This excludes any data that requires IRB interaction, including HIPAA, NIST, and CUI data.
Both clusters operated by ARCH support computational research across all disciplines, including data- and compute-intensive workflows. Storage is provided via a high-performance parallel file system (WEKA) with approximately 5 PB of capacity.
As with any shared HPC resource, users are subject to ARCH’s security, privacy, and usage policies.
Integration Needs
Users can transfer data via standard HPC mechanisms (file transfer tools, network, etc.), and manage projects via the allocation/account system.
For AI projects, containers and software modules let researchers incorporate external libraries (PyTorch, TensorFlow, etc.), custom code, datasets, and more, subject to resource and security constraints.
Limitations
As with all shared clusters, investigators may face queue wait times, resource limits such as GPU availability, and storage quota constraints.
AI Tools on DISCOVERY
DISCOVERY is Research IT’s centrally-managed high-performance computing (HPC) cluster that provides scalable and secure compute resources for data- and compute-intensive research. The platform supports researcher-defined AI and machine learning workflows, enabling the training, fine-tuning, and evaluation of open-source models (such as those available on HuggingFace) using tools such as PyTorch, TensorFlow, and scikit-learn.
DISCOVERY provides the hardware, drivers, and software modules necessary to support researcher-developed workflows. Research IT facilitators can provide guidance and technical support for enabling these environments.
DISCOVERY operates in a NIST 800-171-compliant environment, making it suitable for research involving regulated or sensitive data. It is well-suited for data processing, feature extraction, and simulation-based modeling, and it can also serve as a foundation for hybrid AI workflows that bridge on-premise compute with cloud-based AI environments.
Access
DISCOVERY (HPC) is available to JHU faculty, staff, students, and trainees working on a research project under a sponsoring PI.
Customization & Control
DISCOVERY’s AI tools are highly customizable, supporting containerized workflows as well as custom ML and AI pipelines.
Data Security
DISCOVERY (HPC) is NIST 800-171 compliant, suitable for regulated or sensitive research data, and has restricted/limited network access.
Integration Needs
DISCOVERY is integrated with JHU’s research storage and networks, enabling large-scale data access and collaboration.
Limitations
Job scheduling and queue times can vary on DISCOVERY depending on system load. DISCOVERY also requires technical familiarity with Linux, containers, and Slurm job submissions, and is not designed for integration with commercial API or SaaS-based AI services.
(NEW!) AI Transcription Service
Research IT’s AI Transcription Service provides a secure solution for transcribing audio and video recordings into text using Microsoft Foundry’s Batch Speech-to-Text service.
Hosted in a Research IT-managed, PII/PHI-compliant Azure subscription, this service is designed for Hopkins researchers who need accurate transcription with speaker separation and data security compliance, with minimal technical burden and cost.
Access
The AI Transcription Service is available to all Johns Hopkins researchers.
Cost
$15/month* per user or group + $0.20/hour of audio processed.
*Note: Accounts are only billed in months the service is used (billing cycles run the 16th–15th).
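To make the pricing concrete, here is a minimal Python sketch that estimates the charge for one billing cycle and finds the 16th-to-15th cycle containing a given date. The rates come from the Cost section above. The function names are illustrative, and two points are assumptions rather than stated policy: that a month counts as "used" only when audio is processed, and that partial hours are billed proportionally.

```python
from datetime import date
from typing import Tuple

BASE_FEE_USD = 15.00       # per user or group, per billed month
RATE_PER_HOUR_USD = 0.20   # per hour of audio processed


def monthly_cost(audio_hours: float) -> float:
    """Estimate one billing cycle's charge for a single user or group."""
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    if audio_hours == 0:
        # Assumption: no audio processed means the account is not billed.
        return 0.0
    return round(BASE_FEE_USD + RATE_PER_HOUR_USD * audio_hours, 2)


def billing_cycle(day: date) -> Tuple[date, date]:
    """Return the (start, end) dates of the 16th-to-15th cycle containing `day`."""
    if day.day >= 16:
        start = date(day.year, day.month, 16)
        if day.month == 12:
            end = date(day.year + 1, 1, 15)
        else:
            end = date(day.year, day.month + 1, 15)
    else:
        end = date(day.year, day.month, 15)
        if day.month == 1:
            start = date(day.year - 1, 12, 16)
        else:
            start = date(day.year, day.month - 1, 16)
    return start, end
```

Under these assumptions, a group that processes 40 hours of audio in one cycle would pay $15.00 + 40 × $0.20 = $23.00, and a recording processed on January 10 would fall in the cycle that runs from December 16 to January 15.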
Data Security
The AI Transcription Service is safe to use with PHI/PII data. The service is not NIST-compliant.
Specifications
- Supports the following recorded file types:
  - Audio: mp3, wav, m4a, flac, ogg, wma, and aac
  - Video: mp4, mov, avi, mkv, wmv, webm, m4v, 3gp, and ts
- Transcripts are delivered as .txt and .docx files
- Currently supports English-language recordings only
- Speaker separation has a configurable maximum speaker count
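Before uploading, researchers may want to confirm that a recording uses a supported file type. The sketch below checks a filename against the extensions listed above. The function name is illustrative, and the check looks only at the file extension, not at the file's actual encoding, so it is a convenience rather than a guarantee that the service will accept the file.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the Specifications list above.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "flac", "ogg", "wma", "aac"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "wmv", "webm", "m4v", "3gp", "ts"}


def classify_recording(filename: str) -> Optional[str]:
    """Return "audio" or "video" for a supported extension, otherwise None."""
    # Compare case-insensitively and drop the leading dot from the suffix.
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A file named `interview_01.MP3` is classified as audio, `focus_group.mkv` as video, and `notes.pdf` returns `None`, which indicates the file would need to be converted before upload.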
User Workflow
Users upload recordings to their dedicated storage container via Azure Storage Explorer. Recordings are automatically processed through the AI speech-to-text model, and transcripts are written to the storage container for download.
Recordings are automatically purged after processing. Transcripts and any other files are automatically purged after 30 days.