About statsmodels - statsmodels 0.14.4 documentation

Leo Migdal

-Dec 4, 2025, 9:14 AM

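The models module of scipy.stats was originally written by Jonathan Taylor. For some time it was part of scipy but was later removed. During the Google Summer of Code 2009, statsmodels was corrected, tested, improved and released as a new package. Since then, the statsmodels development team has continued to add new models, plotting tools, and statistical methods.

Most results have been verified against at least one other statistical package: R, Stata or SAS. The guiding principle for the initial rewrite and for continued development is that all numbers have to be verified. Some statistical methods are tested with Monte Carlo studies. While we strive to follow this test-driven approach, there is no guarantee that the code is bug-free and always works. Some auxiliary functions are still insufficiently tested, some edge cases might not be correctly taken into account, and numerical problems are inherent in many of the statistical models. We especially appreciate any help and reports on these kinds of problems so that we can keep improving the existing models.

The user interface of the existing models is mostly settled, and we do not expect many large changes going forward. For the existing code, although there is no guarantee yet on API stability, we have long deprecation periods in all but very special cases, and we try to keep changes that require adjustments by existing users to a minimum. For newer models, we might adjust the user interface as we gain more experience and obtain feedback. These changes will be noted in the release notes published in our documentation.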

If you encounter a bug or unexpected behavior, please report it on the issue tracker. Use the show_versions command to list the installed versions of statsmodels and its dependencies.

Google (www.google.com): Google Summer of Code (GSOC) 2009-2017.

statsmodels uses GitHub to store the updated documentation. Two versions are available: Development, the latest build of the main branch.

API stability is not guaranteed for new features, although even in this case changes will be made in a backwards compatible way if possible.

The stability of a new feature depends on how long it has been in statsmodels main and how much usage it has seen. If there are specific known problems or limitations, they are mentioned in the docstrings.

This release brings official Pyodide support to a statsmodels release. It is otherwise identical to the previous release. Special thanks to Agriya Khetarpal for working through Pyodide-specific issues, and for improving other areas of statsmodels while doing so.

statsmodels provides data sets (i.e. data *and* meta-data) for use in examples, tutorials, model testing, etc.

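The Rdatasets project gives access to the datasets available in R's core datasets package and in many other common R packages. All of these datasets are available to statsmodels through the get_rdataset function, whose signature is get_rdataset(dataname[, package, cache]). The actual data is accessible through the data attribute.

The Dataset object follows the bunch pattern. The full dataset is available in the data attribute. If a dataset does not have a clear interpretation of what should be endog and exog, you can always access the data or raw_data attributes. This is the case for the macrodata dataset, which is a collection of US macroeconomic data rather than a dataset with a specific example in mind. The data attribute contains a record array of the full dataset, and the raw_data attribute contains an ndarray with the names of the columns given by the names attribute.

statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open source Modified BSD (3-clause) license. The online documentation is hosted at statsmodels.org.

statsmodels supports specifying models using R-style formulas and pandas DataFrames. Ordinary least squares is a simple example of this.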

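Use dir(results) to see the available results. Attributes are described in results.__doc__, and results methods have their own docstrings.

Seabold, Skipper, and Josef Perktold. "statsmodels: Econometric and statistical modeling with python." Proceedings of the 9th Python in Science Conference. 2010.

This very simple case study is designed to get you up and running quickly with statsmodels. Starting from raw data, we will show the steps needed to estimate a statistical model and to draw a diagnostic plot. We will only use functions provided by statsmodels or its pandas and patsy dependencies. pandas builds on numpy arrays to provide rich data structures and data analysis tools. The pandas.DataFrame function provides labelled arrays of (potentially heterogeneous) data, similar to the R "data.frame". The pandas.read_csv function can be used to convert a comma-separated values file to a DataFrame object. patsy is a Python library for describing statistical models and building design matrices using R-like formulas. This example uses the API interface. See Import Paths and Structure for information on the difference between importing the API interfaces (statsmodels.api and statsmodels.tsa.api) and importing directly from the module that defines the model.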

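We download the Guerry dataset, a collection of historical data used in support of Andre-Michel Guerry's 1833 Essay on the Moral Statistics of France. The dataset is hosted in comma-separated values (CSV) format by the Rdatasets repository. We could download the file locally and then load it using read_csv, but pandas takes care of all of this automatically for us.

The easiest way to install statsmodels is to install it as part of the Anaconda distribution, a cross-platform distribution for data analysis and scientific computing. This is the recommended installation method for most users. statsmodels is available through conda, provided by Anaconda, which can be used to install the latest release. For Windows users, unofficial recent binaries (wheels) are occasionally made available.

We do not release very often, but the main branch of our source code is usually fine for everyday use. You can get the latest source from our GitHub repository or, if you have git installed, by cloning it. You will need a C compiler installed to build statsmodels. If you are building from the GitHub source rather than from a source release, you will also need Cython. Instructions are available for setting up a C compiler on Windows.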

This very simple case study is designed to get you up and running quickly with statsmodels. Starting from raw data, we will show the steps needed to estimate a statistical model and to draw a diagnostic plot. We will only use functions provided by statsmodels or its pandas and patsy dependencies. After installing statsmodels and its dependencies, we load a few modules and functions.

pandas builds on numpy arrays to provide rich data structures and data analysis tools. The pandas.DataFrame function provides labelled arrays of (potentially heterogeneous) data, similar to the R "data.frame". The pandas.read_csv function can be used to convert a comma-separated values file to a DataFrame object. patsy is a Python library for describing statistical models and building design matrices using R-like formulas. This example uses the API interface. See Import Paths and Structure for information on the difference between importing the API interfaces (statsmodels.api and statsmodels.tsa.api) and importing directly from the module that defines the model.

This page explains how you can contribute to the development of statsmodels by submitting patches, statistical tests, new models, or examples. statsmodels is developed on GitHub using the Git version control system.

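Specify the statsmodels version used. You can do this with sm.version.full_version. If the problem appears to involve other dependencies, also include the output of sm.show_versions().

First, check the Working with the statsmodels Code section for an introduction to the Git version control system.

To be considered for inclusion in statsmodels, a dataset must be in the public domain, be distributed under a BSD-compatible license, or we must obtain permission from the original author.

The Nile data measures the volume of the discharge of the Nile River at Aswan for the years 1871 to 1970. The data are taken from Cobb (1978).

**Step 2**: Add datasets/nile/nile.csv and a new file datasets/__init__.py, which contains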

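**Step 3**: If nile.csv is a transformed or cleaned version of the original data, create a nile/src directory and include the original raw data there. In the nile case, this step is not necessary.

**Step 4**: Copy datasets/template_data.py to nile/data.py. Edit nile/data.py by filling in strings for COPYRIGHT, TITLE, SOURCE, DESCRSHORT, DESCLONG, and NOTE.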
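To see what a finished dataset module looks like, one option is to inspect the Nile dataset that already ships with statsmodels. The sketch below is only an illustration under assumptions: it looks up the metadata strings by name on the bundled `statsmodels.datasets.nile.data` module and falls back to `None` for any name that is not defined. The bundled module appears to spell the long description `DESCRLONG`, whereas the step above writes DESCLONG, so the lookup uses the former.

```python
import importlib

# Import the bundled Nile dataset module by its full dotted path.
nile_module = importlib.import_module("statsmodels.datasets.nile.data")

# Metadata strings a dataset module is expected to define
# (DESCRLONG is an assumption about the spelling used in the bundled module).
metadata_names = ["COPYRIGHT", "TITLE", "SOURCE", "DESCRSHORT", "DESCRLONG", "NOTE"]
metadata = {name: getattr(nile_module, name, None) for name in metadata_names}

for name, text in metadata.items():
    # Show only the first non-empty line of each metadata string.
    if text and text.strip():
        first_line = text.strip().splitlines()[0]
    else:
        first_line = "<not defined>"
    print(f"{name}: {first_line}")

# Load the data itself as a pandas DataFrame.
nile_df = nile_module.load_pandas().data
print(nile_df.head())
```

A new dataset module written from the template should expose the same kind of strings and loader functions.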
